ABSTRACT


A recent area of interest in theoretical computer science has been in the construction 
of so-called pseudo-random bit generators. These generators “stretch” a short sequence 
of truly random bits into a longer sequence of “pseudo-random” bits. These bits 
are sufficiently indistinguishable from truly random bits to be useful in deterministic 
simulation of probabilistic computation. 


Let us say, informally, that a function is one-way if it can be computed in 
polynomial time but no family of polynomial-size circuits can invert it with high 
probability. Yao [Y] has recently proven that if any such function exists, then it can be 
used to construct a pseudo-random bit generator. Furthermore, the existence of this 
generator implies that $R \subseteq \bigcap_{\epsilon > 0} \mathrm{DTIME}(2^{n^{\epsilon}})$.

No proofs of the results have previously appeared in print. In this thesis, we present 
proofs of these results. In addition, we consider two other types of one-way function. 
The first type is much weaker than Yao’s and we show that if such a function exists, 
it can be used to generate a somewhat less powerful pseudo-random bit generator. We 
then consider a second, much stronger type of one-way function and show that if such 
a function exists, then pseudo-random bit generators can be constructed which imply
$R \subseteq P$.

Chapter 1


Introduction 


A major goal of theoretical computer science is to find “efficient” algorithms for 
solving various problems. In this context, “efficient” generally means that the algorithm 
runs in time bounded by some polynomial in the length of the problem instance. 
Unfortunately, finding such algorithms has so far been very difficult to accomplish 
in the case of many practical problems. Equally useful to the theoretician, although 
perhaps not the programmer, is a proof that no efficient algorithm exists for a given 
problem. Success in this area has been even less common. 


The theory of NP-completeness has provided an alternative in the latter case. 
Hundreds of practical problems have been shown to be NP-complete and this is 
generally taken as persuasive evidence that they are intractable.


Recently, an alternative to finding provably efficient methods for solving given 
problems has also been considered. Instead of looking for algorithms which always 
give the correct solution to a given problem, one tries to find algorithms which give 
the correct solution with a very high probability. These “random” or “probabilistic” 
methods have been the focus of much research. 


A probabilistic algorithm relies on coin flips in order to make certain decisions
and its behavior on any given input is therefore not deterministic. Adleman[A] was 
the first to explicitly define the class R (“random polynomial time”) of those problems 
efficiently solvable by probabilistic algorithms. Informally, a problem is in R if there is 
a polynomial time algorithm which solves it with only a very small chance of error. The 
chance that a solution given by such an algorithm is incorrect is just the probability 
that some unlikely sequence of coin flips is produced during its execution. 


Note that any sequence of K coin flips is easily represented as a sequence of K 
bits. Thus, any probabilistic computation can be simulated by a deterministic machine 
which simply runs the probabilistic computation over and over again, trying all possible 
bit sequences of coin flips. Any polynomial time random algorithm can therefore be 
simulated deterministically in exponential time. 
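
The exhaustive simulation described above can be sketched in a few lines. This is only an illustration under stated assumptions: `prob_alg` is a hypothetical probabilistic algorithm that takes its coin flips as an explicit bit tuple, and the majority vote is one simple way to combine the runs (the text does not fix a combining rule).

```python
from itertools import product

def simulate_deterministically(prob_alg, x, num_flips):
    """Run prob_alg(x, flips) on all 2**num_flips coin-flip sequences
    and return the answer given by the majority of them."""
    accepts = 0
    total = 0
    for flips in product((0, 1), repeat=num_flips):
        if prob_alg(x, flips):
            accepts += 1
        total += 1
    return 2 * accepts > total

def toy_alg(x, flips):
    """Hypothetical algorithm deciding 'x is even'.
    It errs only on the single unlucky sequence of all-ones flips."""
    correct = (x % 2 == 0)
    return (not correct) if all(flips) else correct
```

With K coin flips the loop runs 2^K times, which is the exponential cost the text refers to.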


Unfortunately, problems which require exponential time to solve are beyond the 
reach of even today’s fastest computers. As a result, effort has been focused on methods
for “stretching” a short sequence of truly random bits generated by coin flips into a
longer sequence of “pseudo-random” bits, without performing any additional coin flips.
Ideally, such “pseudo-random” bit sequences should be sufficiently indistinguishable
from truly random sequences to be useful in simulating probabilistic computations.


So far, every such pseudo-random bit generator which has been proposed is based 
on an unproved assumption that some given problem is intractable. Blum and Micali 
were the first to demonstrate such a generator. Theirs is based on the assumed difficulty 
of solving the so-called “discrete logarithm problem”. Subsequent generators have been 
based on the difficulty of factoring [G,GMT,Y] and the quadratic residuosity problem 
[BBS]. 

Recently, Yao [Y] has proven a much more general theorem concerning pseudo- 
random bit generators. Let us say, informally, that a function is one-way if it can be 
computed in polynomial time, but no family of polynomial-sized circuits can invert 
it with high probability. Yao states that if any such one-way function exists, it can 
be used to construct a pseudo-random bit generator. Furthermore, the sequences 
produced by such a generator are indeed useful for simulating general probabilistic 
computation: they can be used to simulate any polynomial time probabilistic algorithm
deterministically in sub-exponential time. It remains as a major open problem to 
demonstrate that any such one-way function actually exists. 


This thesis presents proofs of Yao’s results in Chapter 2. Proofs of the results 
have not previously appeared in print. In Chapter 3, we consider two other types 
of one-way functions. The first type, which is weaker than Yao’s, cannot
be inverted with high probability by any circuit of some fixed polynomial size. We
show that if such a function exists, a somewhat less powerful pseudo-random bit 
generator can be constructed. We then consider a second type of one-way function 
which is very much stronger: no circuit of some fixed sub-exponential size can invert 
it without being mistaken frequently. We then show that if such a function exists it 
is possible to construct a bit generator which can be used to simulate any polynomial 
time probabilistic algorithm deterministically in polynomial time. 


Chapter 2 


Yao’s Theorems 


In this chapter, we discuss conditions under which good pseudo-random bit 
generators can be constructed. We will then show that such generators can be used 
to simulate probabilistic computations deterministically. The results are all due to
Yao [Y].

2.1 Building Generators From One-Way Functions 


Suppose G is a deterministic program which, given some k-bit sequence $x$ (a
“seed”) as input, outputs some bit sequence $b_1, b_2, \ldots, b_{P(k)}$ where P(k) is a polynomial.
In order for G to serve our purposes as a useful pseudo-random bit generator we will
require the following:


(1) G is efficient. The sequence $b_1, b_2, \ldots, b_{P(k)}$ is output in time polynomial in k.


(2) The output of G is unpredictable. Given the generator G and the first $i$
output bits $b_1, b_2, \ldots, b_i$ generated from some seed $x$, but not the seed $x$ itself, it is
computationally infeasible to predict the $(i+1)$st bit in the sequence with a better
than 50-50 chance.


This definition was first proposed by Blum and Micali [BM]. 


Let us make the notion of “unpredictability” mentioned in condition (2) above 
more formal: 


Definition Let P be a polynomial, $S_k$ a multiset consisting of P(k)-long bit sequences
and $S = \bigcup_k S_k$. A polynomial-size next-bit test for S is a family of circuits $C = \{C_k^i\}$.
Each circuit $C_k^i$ has $i$ Boolean inputs where $i < P(k)$, one Boolean output and size
polynomial in k. On input the first $i$ bits of a sequence $s$ randomly selected from $S_k$,
$C_k^i$ will output a bit $b$. Let $p_{k,i}^C$ denote the probability that $b$ equals the $(i+1)$st bit of $s$. We
say that S passes the test C if for every polynomial Q, all sufficiently large k and all
$i < P(k)$:

$$p_{k,i}^C < \frac{1}{2} + \frac{1}{Q(k)}.$$


We will refer to both $C$ and $C_k^i$ as next-bit tests.
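
To make the success probability of a next-bit test concrete, here is a minimal sketch that computes it exactly for a small explicit multiset of sequences. The names `next_bit_success` and `copy_last` are hypothetical, and an ordinary Python function stands in for the circuit.

```python
from fractions import Fraction
from itertools import product

def next_bit_success(sequences, predictor, i):
    """Fraction of sequences s (counted with multiplicity) for which
    predictor(first i bits of s) equals the (i+1)st bit of s."""
    hits = sum(1 for s in sequences if predictor(s[:i]) == s[i])
    return Fraction(hits, len(sequences))

def copy_last(prefix):
    """Toy predictor: guess that the next bit repeats the last bit seen."""
    return prefix[-1]

# All 3-bit sequences: a truly random source, so no predictor beats 1/2.
uniform = list(product((0, 1), repeat=3))

# A biased multiset in which the second bit usually repeats the first.
biased = [(0, 0), (1, 1), (0, 0), (1, 0)]
```

On `uniform` the predictor succeeds on exactly half of the sequences, while on `biased` it succeeds on three of the four, so `biased` would fail this particular test.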


Now we can state condition (2) of our definition of pseudo-random bit generators 
as follows: 


(2) Let $S_k$ be the multiset of sequences output by G on all k-bit seeds. Then $S = \bigcup_k S_k$
passes all polynomial-size next-bit tests. (Note that $S_k$ may be a multiset since
two different seeds might cause G to output the same sequence.)


We formally define a cryptographically strong pseudo-random bit generator
following Blum and Micali [BM]:


Definition Let P be a polynomial, $I_k$ the set of all strings of length k, and $D_k \subseteq I_k$
a set of inputs (or “seeds”) of length k. Let G be a deterministic algorithm which,
on input a seed $x \in D_k$, outputs a P(k)-long bit sequence $s_x$ in Poly(k) time. Let
$S_k = \{ s_x \mid x \in D_k \}$. The algorithm G is a cryptographically strong pseudo-random bit
generator (a P-CSB generator) if the multiset $S = \bigcup_k S_k$ passes all next-bit tests.


Every explicit CSB generator which has so far been proposed is based on an 
unproved assumption that some given problem is intractable. Blum and Micali [BM] 
were the first to show such a generator. They based it on the assumed difficulty of 
solving the discrete logarithm problem. For further details on the discrete logarithm 
problem, see [AL]. 


Definition Let p be a prime. The set of integers [1, p-1] forms a cyclic group $Z_p^*$ under
multiplication mod p. Given a prime p, a generator g for $Z_p^*$ and $y \in Z_p^*$, the Discrete
Logarithm Problem (DLP) is to find the unique $x \in Z_p^*$ such that $y = g^x \pmod{p}$. This
$x$ is often denoted $\mathrm{index}_{p,g}(y)$.


Let $C = \{C_n\}$ be a family of circuits such that $C_n$ has 3n Boolean inputs and
size P(n) for some polynomial P. Think of these inputs as consisting of an n-bit
prime p, an n-bit generator g for $Z_p^*$, and an n-bit $y \in Z_p^*$. Blum and Micali are able
to construct a CSB generator under the (unproved) assumption that for every such
family of circuits $C = \{C_n\}$ and all polynomials P and Q: for all sufficiently large
n, $C_n(p, g, y) \ne \mathrm{index}_{p,g}(y)$ for at least a fraction 1/Q(n) of the n-bit primes p. Some
subsequent generators have been based on the difficulty of factoring [G,GMT,Y] and
the quadratic residuosity problem [BBS].


Although the Discrete Logarithm Problem seems difficult, the inverse of the DLP
is easily solved in polynomial time. Given a prime p, a generator g for $Z_p^*$ and $x \in Z_p^*$,
the function $\mathrm{POWER}(g, x, p) = g^x \pmod{p}$ can be calculated by successive squaring
in time polynomial in the length of g, x, and p. Thus POWER can be thought of as a
“one-way” function: easy to compute but difficult to invert. Yao [Y] has subsequently
shown that given any “one-way” function, it is possible to construct a CSB generator.
Let us make this notion of “one-way” more precise.
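
As an illustration of the asymmetry, the following sketch computes POWER by successive squaring and, for contrast, solves the discrete logarithm by exhaustive search. The function names are hypothetical, and the brute-force search is shown only to highlight the cost gap; it is not a claim about the best known algorithm.

```python
def power(g, x, p):
    """Compute g**x mod p by successive squaring.
    Uses about log2(x) squarings, so time is polynomial in the bit length."""
    result = 1
    base = g % p
    while x > 0:
        if x & 1:                      # current low bit of the exponent is set
            result = (result * base) % p
        base = (base * base) % p       # square for the next bit
        x >>= 1
    return result

def discrete_log_bruteforce(g, y, p):
    """Find x in [1, p-1] with g**x = y (mod p) by trying every exponent.
    Takes up to p - 1 steps, i.e. exponential in the bit length of p."""
    value = 1
    for x in range(1, p):
        value = (value * g) % p
        if value == y:
            return x
    raise ValueError("y is not a power of g modulo p")
```

For example, with p = 7 and generator g = 3, `power(3, 5, 7)` gives 5, and the brute-force search recovers the exponent 5 from y = 5.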


Definition (Yao [Y]) Let $I_k$ be the set of all k-bit strings, let $D_k \subseteq I_k$ and let
$f_k : D_k \to D_k$ be a sequence of permutations. We will write $D = \bigcup_k D_k$ and $f = \{f_k\}$.
Then, f is a weak one-way function if the following properties are satisfied:


(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input k, selects an $x \in D_k$ with uniform probability.


(2) There exists a polynomial-time algorithm which, on inputs k and $x \in D_k$, computes
$f_k(x)$.


(3) There exists a polynomial Q such that the following holds. Let $C = \{C_k\}$ be
any family of polynomial-size circuits where each $C_k$ has k inputs. Then for all
sufficiently large k:


$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{Q(k)}$ of the $x \in D_k$.


We can now state an important result of Yao’s precisely. 


Theorem 1 (Yao[Y]) Given any weak one-way function f, it is possible to construct 
a P-CSB generator for any polynomial P.


Note the generality of this theorem. It says that given any weak one-way function, 
we can construct a generator which “stretches” a k-bit seed into a P(k)-bit pseudo- 
random sequence, for any polynomial P. Thus we can decide beforehand, up to a 
polynomial, how much “stretching” we want our generator to do. 


Note also one slight difference between the difficulty assumed about inverting 
a general one-way function and the difficulty which Blum and Micali assume about 
inverting the POWER function. In the general definition, one assumes that there exists 
some polynomial Q such that any family of polynomial-size circuits will fail to invert
a one-way function with probability at least 1/Q(k). Blum and Micali assume that
this holds for every polynomial Q. It is, of course, possible to weaken the Blum-Micali 
assumption to conform with the more general case. The resulting generator is somewhat 
less efficient than that in the Blum-Micali paper. 

Proof of Theorem 1 We first show the following lemma, due to Blum and Micali [BM],
which provides a set of sufficient conditions for constructing CSB generators. We then
show that, given any weak one-way function, it is possible to satisfy these conditions.

Lemma 1.1 (Blum, Micali [BM]) Let $I_k$ be the set of k-bit strings and let $D_k \subseteq I_k$.
Let $g_k : D_k \to D_k$ be a sequence of permutations and let $B_k : D_k \to \{0, 1\}$ be a sequence
of predicates. We will write $D = \bigcup_k D_k$, $g = \{g_k\}$ and $B = \{B_k\}$. If the following
set of conditions holds, then it is possible to construct a P-CSB generator, for any
polynomial P.

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input k, chooses $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs k and $x \in D_k$, computes
$g_k(x)$.

(3) There exists a polynomial-time algorithm which, on inputs k and $x \in D_k$, computes
$B_k(g_k(x))$.

(4) Let $C = \{C_k\}$ be any family of polynomial-size circuits such that each $C_k$ has k
inputs and let Q be any polynomial. Then, for all sufficiently large k:

$C_k(x) \ne B_k(x)$ for at least a fraction $\frac{1}{2} - \frac{1}{Q(k)}$ of the $x \in D_k$.


Proof of Lemma 1.1 First we construct the P-CSB generator and then prove that
its outputs must pass all next-bit tests.


Choose an appropriate value of k to be the seed length and choose in probabilistic
polynomial time a random $x \in D_k$ to be used as the seed. Set $c = P(k)$, the desired
length of the output sequence, and generate the bits:

$$B_k(g_k(x)),\ B_k(g_k^2(x)),\ \ldots,\ B_k(g_k^c(x)).$$

The notation $g_k^j$ indicates the j-fold composition of $g_k$. Now, output these bits in
reverse order, i.e.:

$$B_k(g_k^c(x)),\ B_k(g_k^{c-1}(x)),\ \ldots,\ B_k(g_k(x)).$$

It should be clear that all of this can be accomplished in polynomial time by conditions
(2) and (3).
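
The generator just described can be sketched as follows. The permutation `toy_g` and predicate `toy_B` are hypothetical placeholders chosen so the example runs; they are not hard to predict and carry no security. For simplicity the sketch evaluates B directly on each iterate, whereas condition (3) only promises an efficient way to compute B(g(x)) from x.

```python
def csb_generate(seed, g, B, length):
    """Iterate the permutation g from the seed, record B of each iterate,
    and output the recorded bits in reverse order."""
    bits = []
    x = seed
    for _ in range(length):
        x = g(x)              # x is now g^j(seed) for j = 1, 2, ..., length
        bits.append(B(x))     # bit number j is B(g^j(seed))
    bits.reverse()            # output B(g^c(seed)) first, B(g(seed)) last
    return bits

def toy_g(x):
    """Toy permutation of {1, ..., 6}: x -> 3**x mod 7 (an assumption for the demo)."""
    return pow(3, x, 7)

def toy_B(x):
    """Toy predicate: the low-order bit of x (easy to compute, so NOT secure)."""
    return x & 1
```

Starting from seed 1 the iterates are 3, 6, 1, 3, so `csb_generate(1, toy_g, toy_B, 4)` records the bits 1, 0, 1, 1 and outputs them reversed as `[1, 1, 0, 1]`.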

It remains to show that the sequences output by this generator pass all next-bit
tests. Suppose that this is not true. Then there exists a polynomial Q and a family of
Poly(k) size circuits $C = \{C_k^i\}$ where each $C_k^i$ has $i < P(k)$ inputs and the following
holds. For each of infinitely many values of k there exists some $i$ such that:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)}$$

where $p_{k,i}^C$ is the probability that the circuit $C_k^i$ outputs the correct $(i+1)$st bit of a
sequence when given the first $i$ bits as input.


We now construct a polynomial-size family of circuits $A = \{A_k\}$ where each $A_k$
has k inputs and such that for infinitely many values of k, $A_k$ correctly computes the
predicate $B_k(x)$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$. This contradicts condition (4).


Choose one of the infinitely many values of k such that for some $i < P(k)$:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)}.$$

The circuit $A_k$ uses $C_k^i$ as a “subroutine”. On input $x \in D_k$, the circuit $A_k$ first
generates the i-bit sequence:

$$B_k(g_k^i(x)),\ B_k(g_k^{i-1}(x)),\ \ldots,\ B_k(g_k(x))$$

and inputs this sequence to the next-bit test circuit $C_k^i$. Note that this can be
accomplished with a polynomial number of circuit gates by conditions (2) and (3). The
circuit $A_k$ then outputs whatever value $C_k^i$ outputs on these bits.


Note that the bits:

$$B_k(g_k^i(x)),\ B_k(g_k^{i-1}(x)),\ \ldots,\ B_k(g_k(x))$$

are the first $i$ bits of the CSB sequence:

$$B_k(g_k^i(x)),\ \ldots,\ B_k(g_k(x)),\ B_k(x),\ \ldots,\ B_k(g_k^{i-c+1}(x)).$$

Since the $(i+1)$st bit of this sequence is $B_k(x)$, the circuit $A_k$ will correctly compute
$B_k(x)$ whenever the circuit $C_k^i$ correctly outputs the $(i+1)$st bit. Furthermore, the seed
of this sequence is $g_k^{i-c}(x)$ and, since $g_k$ is, by assumption, a permutation, so is $g_k^{i-c}$.
Thus, $g_k^{i-c}$ generates all possible seeds $z \in D_k$, which means that the next-bit circuit
$C_k^i$ will correctly output the $(i+1)$st bit of this particular sequence for a fraction at
least $\frac{1}{2} + \frac{1}{Q(k)}$ of the $x \in D_k$. Thus, the circuit $A_k$ computes $B_k(x)$ for a fraction at least $\frac{1}{2} + \frac{1}{Q(k)}$ of
the $x \in D_k$, which contradicts condition (4). This completes the proof of the lemma.


It remains to show that, given some arbitrary one-way function $f = \{f_k\}$ over an
accessible domain $E = \bigcup_k E_k$, it is possible to construct a new domain $D = \bigcup_k D_k$,
a function $g = \{g_k\}$ and a predicate $B = \{B_k\}$ which satisfy the four conditions of
lemma 1.1. This is proven in the following lemma due to Yao.

Lemma 1.2 (Yao [Y]) Let $f = \{f_k\}$ be a weak one-way function over an accessible
domain $E = \bigcup_k E_k$. Then, by definition, for each k, given any $x \in E_k$ it is possible to
compute $f_k(x)$ in polynomial time and there exists some constant d with the following
property. Given any family of polynomial-size circuits $C = \{C_k\}$ where $C_k$ has k
inputs, we have, for all sufficiently large k:


$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{k^d}$ of the $x \in E_k$.

The following construction satisfies conditions (1)-(4) of lemma 1.1:


(a) Set $D_k$ to be the cartesian product of ck copies of $E_k$, where c is a constant which
depends on k and the constant d mentioned in the statement of this lemma. The
exact value of c will be determined later in the proof. Suffice it for now to say that
it is a polynomial function of k. Formally:


$$D_k = \{ (x_1, x_2, \ldots, x_{ck}) \mid x_1 \in E_k, \ldots, x_{ck} \in E_k \}.$$


(b) Let $g_k((x_1, x_2, \ldots, x_{ck})) = (f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck}))$ where each $x_j \in E_k$.
(c1) Let $B_k^i(x)$ = the ith bit of $f_k^{-1}(x)$ where $x \in E_k$.
(c2) Let
$$B_k((x_1, x_2, \ldots, x_{ck})) = \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(x_{c(i-1)+j})$$
where each $x_j \in E_k$ and “$\oplus$” denotes the “exclusive-or” operation.

Proof of Lemma 1.2 It should be fairly obvious that conditions (1), (2) and (3) of
lemma 1.1 are satisfied by this construction. The domain $D_k$ is certainly accessible
since, by assumption, $E_k$ is accessible. Given any $(x_1, x_2, \ldots, x_{ck}) \in D_k$, the function
$g_k((x_1, x_2, \ldots, x_{ck}))$ can be computed in polynomial time since, by assumption, each
$f_k(x_j)$ is computable in polynomial time and, as mentioned, c is polynomial
in k. Finally, the predicate:


$$B_k(g_k((x_1, x_2, \ldots, x_{ck}))) = B_k((f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck}))) = \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(f_k(x_{c(i-1)+j}))$$


can easily be computed in polynomial time since for any $x \in E_k$:


$B_k^i(f_k(x))$ = the ith bit of $f_k^{-1}(f_k(x))$
= the ith bit of $x$.
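
The observation that B(g(x)) needs no inversion can be checked on a toy instance. Everything below is an assumption made for illustration: the sizes `K` and `C`, the stand-in permutation `f` (which is trivially invertible, so not one-way), and the convention that the "first bit" of a string is its leftmost bit.

```python
import random

K, C = 4, 3                         # toy parameters: k-bit strings, c elements per group
N = 1 << K

def f(x):
    """Toy permutation of the K-bit strings standing in for f_k (NOT one-way)."""
    return (5 * x + 3) % N

F_INV = {f(x): x for x in range(N)}  # inverse by table lookup, feasible only for toy sizes

def bit(x, i):
    """The i-th bit of a K-bit string, with i = 1 the leftmost bit."""
    return (x >> (K - i)) & 1

def g(xs):
    """Construction (b): apply f to every component."""
    return tuple(f(x) for x in xs)

def B(xs):
    """Construction (c2): XOR of the i-th bit of f^{-1}(x_j) over each x_j in group i."""
    acc = 0
    for m, x in enumerate(xs):
        acc ^= bit(F_INV[x], m // C + 1)   # component m (0-based) lies in group m // C + 1
    return acc

def B_of_g(xs):
    """B(g(xs)) computed directly from the bits of xs, with no inversion."""
    acc = 0
    for m, x in enumerate(xs):
        acc ^= bit(x, m // C + 1)
    return acc
```

For any tuple `xs` of C*K elements, `B(g(xs))` and `B_of_g(xs)` agree, which is exactly why condition (3) holds even though B itself is assumed hard to compute.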


The more difficult task is to prove that, as constructed, the predicate $B_k$ satisfies
condition (4) of lemma 1.1. Let $C = \{C_k\}$ be a family of polynomial-size circuits
where each circuit $C_k$ takes inputs from the domain $D_k$ constructed in (a) above. For
every such family of circuits $\{C_k\}$ and every polynomial Q(k), we will show, for all
sufficiently large k:


$C_k((x_1, x_2, \ldots, x_{ck})) \ne B_k((x_1, x_2, \ldots, x_{ck}))$

for at least a fraction $\frac{1}{2} - \frac{1}{Q(k)}$ of the $(x_1, x_2, \ldots, x_{ck}) \in D_k$.


Actually, we will choose to formulate this problem slightly differently. For every family
of circuits $\{C_k\}$ and every polynomial Q(k), we will show, for all sufficiently large k:


$$\mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))] < \frac{1}{2} + \frac{1}{Q(k)}$$


where $(x_1, x_2, \ldots, x_{ck})$ is a randomly chosen element of $D_k$. This will be proven through
a sequence of three lemmas.


Note one small point. In order to be absolutely consistent, we should actually show
that the probability of any circuit $C_k$ correctly computing the predicate $B_k$ is less than
$\frac{1}{2} + \frac{1}{Q(ck^2)}$ since the length of each element in $D_k$ is $ck^2$ bits. However, since $Q(ck^2)$ is
a polynomial in k, it suffices to prove the result as stated.


Lemma 1.2.1 Suppose that there exists a polynomial-size family of circuits $C = \{C_k\}$
where each circuit $C_k$ takes inputs from the domain $D_k$ and such that for all k and all
$(x_1, x_2, \ldots, x_{ck}) \in D_k$ we have:


$$C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck})).$$


Then, there exists a polynomial-size family of circuits $A = \{A_k\}$ where each $A_k$ has k
Boolean inputs such that for all k and all $x \in E_k$ we have:


$$A_k(x) = f_k^{-1}(x).$$


This lemma just says that if we can easily solve the predicate $B_k$ on all inputs,
then we can easily invert the assumed one-way function $f_k$ on all inputs. Although
we do not actually need the lemma in order to show our desired result, its proof will
shed some light on why the fact that $f_k$ is hard to invert implies that $B_k$ is difficult to
compute.

Proof of Lemma 1.2.1 Let us examine the predicate $B_k$ a bit more closely:


$$\begin{aligned}
B_k((x_1, x_2, \ldots, x_{ck})) &= \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(x_{c(i-1)+j}) \\
&= B_k^1(x_1) \oplus B_k^1(x_2) \oplus \cdots \oplus B_k^1(x_c) && (1) \\
&\quad \oplus B_k^2(x_{c+1}) \oplus B_k^2(x_{c+2}) \oplus \cdots \oplus B_k^2(x_{2c}) && (2) \\
&\quad \oplus B_k^3(x_{2c+1}) \oplus B_k^3(x_{2c+2}) \oplus \cdots \oplus B_k^3(x_{3c}) && (3) \\
&\quad \ \vdots \\
&\quad \oplus B_k^k(x_{(k-1)c+1}) \oplus B_k^k(x_{(k-1)c+2}) \oplus \cdots \oplus B_k^k(x_{kc}). && (k)
\end{aligned}$$


Note that the predicate $B_k^1$ is applied to inputs $x_1$ through $x_c$ (line (1)), $B_k^2$ is
applied to inputs $x_{c+1}$ through $x_{2c}$ (line (2)), and so on, down to line (k). Thus,
$B_k((x_1, x_2, \ldots, x_{ck}))$ is just a big exclusive-or of the first bit of $f_k^{-1}(x_j)$ for j = 1 to c,
the second bit of $f_k^{-1}(x_j)$ for j = c+1 to 2c, and so on, down to the kth bit of $f_k^{-1}(x_j)$
for j = (k-1)c + 1 to kc.

So, suppose we are given some $x \in E_k$ and wish to compute $f_k^{-1}(x)$. We will
calculate each bit of $f_k^{-1}(x)$ separately. To get the first bit, choose some random
elements $x_2 \in E_k, x_3 \in E_k, \ldots, x_{ck} \in E_k$ and calculate u where:


$$u = C_k((x, f_k(x_2), f_k(x_3), \ldots, f_k(x_{ck}))).$$


(Remember the circuit $C_k$ correctly computes the predicate $B_k$ on all inputs.) This is
just a big exclusive-or of $B_k^1(x)$ (i.e., the first bit of $f_k^{-1}(x)$) and a series of terms of the
form $B_k^i(f_k(x_j))$. Recall that $B_k^i(f_k(x_j))$ is just the ith bit of $x_j$; thus the exclusive-or
of all of these latter terms can be computed in polynomial time. More formally, if the
exclusive-or of all the $B_k^i(f_k(x_j))$ terms is denoted v then:


$$\begin{aligned}
v &= \bigoplus_{j=2}^{c} B_k^1(f_k(x_j)) \ \oplus\ \bigoplus_{i=2}^{k} \bigoplus_{j=1}^{c} B_k^i(f_k(x_{c(i-1)+j})) \\
&= \bigoplus_{j=2}^{c} (\text{first bit of } x_j) \ \oplus\ \bigoplus_{i=2}^{k} \bigoplus_{j=1}^{c} (i\text{th bit of } x_{c(i-1)+j})
\end{aligned}$$


and v is computable in polynomial time. Thus, if we exclusive-or u with v then
everything is “cancelled out” with the exception of $B_k^1(x)$, which is just the first bit of
$f_k^{-1}(x)$ as desired. Formally:


$u \oplus v = B_k^1(x)$
= the first bit of $f_k^{-1}(x)$.

Since $C_k$ is, by assumption, a polynomial-size circuit and v is polynomial-time
computable, it is possible to get the first bit of $f_k^{-1}(x)$ with a polynomial number of
circuit gates.

It should be clear that the second bit of $f_k^{-1}(x)$ can be computed similarly. Suppose
we choose $x_1 \in E_k, \ldots, x_c \in E_k, x_{c+2} \in E_k, \ldots, x_{ck} \in E_k$ and compute:


$$C_k((f_k(x_1), \ldots, f_k(x_c), x, f_k(x_{c+2}), \ldots, f_k(x_{ck}))).$$


This is just a big exclusive-or of $B_k^2(x)$ (the second bit of $f_k^{-1}(x)$) and a series of terms
of the form $B_k^i(f_k(x_j))$, each of which can be calculated in polynomial time. Thus, using
the same trick as before, it is possible to “isolate out” the second bit of $f_k^{-1}(x)$ with
a polynomial number of circuit gates. The remaining bits can be computed similarly.
The following algorithm can easily be transformed into a polynomial-size circuit $A_k$
with k Boolean inputs such that for all $x \in E_k$ we have:


$$A_k(x) = f_k^{-1}(x).$$


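Algorithm A1


input: $x \in E_k$
output: $y \in E_k$ such that $y = f_k^{-1}(x)$


(1) Set y = null
(2) For h = 1 to k do

(2.1) Choose $x_1, x_2, \ldots, x_{c(h-1)}, x_{c(h-1)+2}, \ldots, x_{ck}$,
all elements of $E_k$

(2.2) Set $u = C_k((f_k(x_1), \ldots, f_k(x_{c(h-1)}), x, f_k(x_{c(h-1)+2}), \ldots, f_k(x_{ck})))$

(2.3) Set $v = \bigoplus_{i=1}^{h-1} \bigoplus_{j=1}^{c} (i\text{th bit of } x_{c(i-1)+j})$
$\oplus \bigoplus_{j=2}^{c} (h\text{th bit of } x_{c(h-1)+j})$
$\oplus \bigoplus_{i=h+1}^{k} \bigoplus_{j=1}^{c} (i\text{th bit of } x_{c(i-1)+j})$

(2.4) Set $w = u \oplus v$

(2.5) Set $y = w \circ y$ ($\circ$ denotes concatenation)
(3) Output y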

Note that this algorithm need not actually choose a new set of ck - 1 elements in $E_k$
each time it calculates a new bit, as specified in line (2.1). This has been done here
for clarity of exposition. In fact, when this algorithm is converted into the circuit $A_k$,
any set of ck - 1 elements in $E_k$ can simply be “built-in” to this circuit. These same
elements can be reused to get each of the k bits of $f_k^{-1}(x)$. This completes the proof
of Lemma 1.2.1. Since $f_k$ is assumed to be a one-way function, we have, in fact, just
shown:


Corollary 1.2.1 No family of polynomial-size circuits $C = \{C_k\}$ can compute the
predicate $B = \{B_k\}$ on all inputs.
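
The bit-by-bit inversion in the proof of Lemma 1.2.1 can be exercised on a toy instance. The sketch below is illustrative only: the sizes `K` and `C`, the stand-in permutation `f`, and the table-based predicate `B` (which plays the role of a circuit computing the predicate correctly on all inputs) are all assumptions, and the bits are assembled left to right with "first bit" taken to mean the leftmost bit.

```python
import random

K, C = 4, 3                          # toy parameters: k-bit strings, c elements per group
N = 1 << K

def f(x):
    """Toy permutation of the K-bit strings standing in for f_k (NOT one-way)."""
    return (5 * x + 3) % N

F_INV = {f(x): x for x in range(N)}   # used only to simulate the perfect oracle

def bit(x, i):
    """The i-th bit of a K-bit string, with i = 1 the leftmost bit."""
    return (x >> (K - i)) & 1

def B(xs):
    """The predicate B_k; here it doubles as the 'perfect circuit' C_k."""
    acc = 0
    for m, x in enumerate(xs):
        acc ^= bit(F_INV[x], m // C + 1)
    return acc

def invert_with_perfect_oracle(z, oracle, rng):
    """Recover f^{-1}(z) one bit at a time using an oracle for B."""
    y = 0
    for h in range(1, K + 1):
        pos = C * (h - 1)                          # first slot of group h holds z
        xs = [rng.randrange(N) for _ in range(C * K)]
        query = [f(x) for x in xs]
        query[pos] = z
        u = oracle(tuple(query))
        v = 0
        for m, x in enumerate(xs):                 # XOR of the known bits of the helpers
            if m != pos:
                v ^= bit(x, m // C + 1)
        y = (y << 1) | (u ^ v)                     # u XOR v is the h-th bit of f^{-1}(z)
    return y
```

Calling `invert_with_perfect_oracle(z, B, random.Random(1))` returns the preimage of `z` under `f` for every `z`, regardless of which helper elements are drawn.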

Before going on to the next lemma, we introduce some notation. We can think of
each $(x_1, x_2, \ldots, x_{ck}) \in D_k$ as consisting of k “groups”, each containing c elements. The
first “group” consists of $x_1, x_2, \ldots, x_c$, the second “group” consists of $x_{c+1}, x_{c+2}, \ldots, x_{2c}$,
and so on until the kth group: $x_{(k-1)c+1}, x_{(k-1)c+2}, \ldots, x_{kc}$. As we saw before, the value
of $B_k((x_1, x_2, \ldots, x_{ck}))$ is just a big exclusive-or of terms of the form $B_k^1(x_j)$ for each $x_j$
in the first group, $B_k^2(x_j)$ for each $x_j$ in the second group, and so on, until $B_k^k(x_j)$ for
each $x_j$ in the kth group. Thus, $B_k$ is just a big exclusive-or of the ith bit of $f_k^{-1}(x_j)$
for each $x_j$ in the ith group, where i ranges from 1 to k.

Definition Let $C = \{C_k\}$ be a family of circuits where each $C_k$ takes inputs from the
domain $D_k$. Think of the circuit $C_k$ as trying to compute the predicate $B_k$. Now, for
each $x \in E_k$ define the probability $s_i^C(x)$ as follows:


$$s_i^C(x) = \mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))]$$

where $(x_1, x_2, \ldots, x_{ck})$ is randomly drawn from the elements of $D_k$
which have $x$ in the ith group.


In other words, suppose that $(x_1, x_2, \ldots, x_{ck})$ has $x$ in the ith group, i.e., $x = x_j$ for at least
one j from (i-1)c + 1 to ic. Then, $s_i^C(x)$ is just the probability that $C_k$ will
correctly compute the predicate $B_k$ on the input $(x_1, x_2, \ldots, x_{ck})$. Intuitively, if $s_i^C(x)$
is large, then we have a good chance of being able to get the ith bit of $f_k^{-1}(x)$ by
repeatedly using the circuit $C_k$. If $s_i^C(x)$ is large for every value of i from 1 to k, then
we have a good chance of being able to get all the bits of $f_k^{-1}(x)$ by repeatedly using
the circuit $C_k$. This last statement is made precise in the following lemma.


Lemma 1.2.2 Let $C = \{C_k\}$ be a family of polynomial-size circuits where each circuit
$C_k$ takes inputs from the domain $D_k$. Suppose that there exists a polynomial V(k)
and a constant d such that for all sufficiently large k we have, for a randomly chosen
$x \in E_k$:


$$\mathrm{Prob}\left[ s_i^C(x) > \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of } i \text{ from 1 to } k \right] \ge 1 - \frac{1}{k^d}.$$

Then, there exists a family of polynomial-size circuits $A = \{A_k\}$ where each $A_k$ has k
Boolean inputs and for all sufficiently large k:


$A_k(x) = f_k^{-1}(x)$ for at least a fraction $1 - \frac{1}{k^d}$ of the $x \in E_k$.


In other words, there exists a family of polynomial-size circuits which invert $f_k$ with
high probability if, for a randomly chosen $x \in E_k$:


$$\mathrm{Prob}\left[ s_1^C(x) > \frac{1}{2} + \frac{1}{V(k)},\ \ldots,\ \text{and } s_k^C(x) > \frac{1}{2} + \frac{1}{V(k)} \right] \ge 1 - \frac{1}{k^d}.$$


We will show in the proof of this lemma that if $s_i^C(x)$ is large then we can, with a
polynomial-size circuit, correctly compute the ith bit of $f_k^{-1}(x)$. Thus, if $s_i^C(x)$ is large
for every value of i from 1 to k, then we can, with a polynomial-size circuit, compute
all of $f_k^{-1}(x)$. The lemma simply says that if, for all sufficiently large k, $s_i^C(x)$ is large
for most of the $x \in E_k$, then we can, for all sufficiently large k, invert $f_k$ on most of
the $x \in E_k$.


Let us first see how to compute the first bit of $f_k^{-1}(x)$ for some $x$ which has the
property that $s_1^C(x) > \frac{1}{2} + \frac{1}{V(k)}$. On all inputs $(x_1, x_2, \ldots, x_{ck}) \in D_k$ which have $x$ in
the first group we know that:

$$\mathrm{Prob}[C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))] > \frac{1}{2} + \frac{1}{V(k)}.$$

Given $(x_1, x_2, \ldots, x_{ck}) \in D_k$, there are c elements in the first group, namely $x_1, x_2, \ldots, x_c$.
Among these elements, let us say that $x_j$ occupies position j. Then, since $C_k$ correctly
computes the predicate $B_k$ with probability at least $\frac{1}{2} + \frac{1}{V(k)}$ whenever $x$ is in the first
group, there must be one position j from 1 to c such that $C_k$ correctly computes $B_k$
with probability at least $\frac{1}{2} + \frac{1}{V(k)}$ whenever $x$ is in position j. Since this position will
be “built-in” to the circuit which computes $f_k^{-1}(x)$, we may assume, without loss of
generality, that it is position 1.


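Now, just as in Lemma 1.2.1, we begin by choosing some random elements
$x_2 \in E_k, x_3 \in E_k, \ldots, x_{ck} \in E_k$ and compute u where:


$$u = C_k((x, f_k(x_2), f_k(x_3), \ldots, f_k(x_{ck}))).$$


Also just as before, we compute v, a big exclusive-or of terms of the form $B_k^i(f_k(x_j))$:


$$\begin{aligned}
v &= \bigoplus_{j=2}^{c} B_k^1(f_k(x_j)) \ \oplus\ \bigoplus_{i=2}^{k} \bigoplus_{j=1}^{c} B_k^i(f_k(x_{c(i-1)+j})) \\
&= \bigoplus_{j=2}^{c} (\text{first bit of } x_j) \ \oplus\ \bigoplus_{i=2}^{k} \bigoplus_{j=1}^{c} (i\text{th bit of } x_{c(i-1)+j}).
\end{aligned}$$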

Finally, we calculate $u \oplus v$. If $C_k$ correctly computes $B_k$ on all inputs (as assumed in
Lemma 1.2.1) then, as we saw earlier, $u \oplus v$ is the first bit of $f_k^{-1}(x)$. Here, however,
we know only that $C_k$ succeeds with high probability when $x$ is in position 1. In fact,
$C_k$ may have computed the predicate $B_k$ incorrectly on the chosen input, in which
case $u \oplus v$ may well not be the correct first bit of $f_k^{-1}(x)$. However, if we recompute
$u \oplus v$ a number of times using a new set of $x_2, x_3, \ldots, x_{ck}$ each time, we should
expect to get the correct first bit about $\frac{1}{2} + \frac{1}{V(k)}$ of the time. Saying it differently, if
the average value of $u \oplus v$ is at least $\frac{1}{2} + \frac{1}{V(k)}$, then the correct first bit is probably 1,
whereas if this average value is at most $\frac{1}{2} - \frac{1}{V(k)}$ then the correct bit is probably 0.


This method of repeated recalculation using different inputs is known as “sampling”
and each recalculation is called a “trial”. Clearly, the more trials we perform, the
more certain we will be about which bit to choose. Let avg be the average value of
$u \oplus v$ computed during sampling to determine the first bit of $f_k^{-1}(x)$. Let avgtrue
be the value of avg which we would obtain if we performed a trial for every possible
distinct set of ck - 1 values $x_2, x_3, \ldots, x_{ck}$ (i.e., avgtrue is the true value which avg
is estimating). Now, suppose that we perform enough trials to be certain that avg is
within 1/(2V(k)) of avgtrue. Then, if the first bit is actually 1, we must have:


$$\mathrm{avg} \ge \frac{1}{2} + \frac{1}{V(k)} - \frac{1}{2V(k)}$$
so
$$\mathrm{avg} \ge \frac{1}{2} + \frac{1}{2V(k)}.$$


Alternatively, if the first bit is actually 0, we must have:


$$1 - \mathrm{avg} \ge \frac{1}{2} + \frac{1}{V(k)} - \frac{1}{2V(k)}$$
so
$$\mathrm{avg} \le \frac{1}{2} - \frac{1}{2V(k)}.$$


Thus, if we perform enough trials to be certain that avg is within 1/(2V(k)) of avgtrue
then we can, with certainty, choose the correct first bit of $f_k^{-1}(x)$. We show below that
this can be achieved with only a polynomial number of trials.
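
The sampling idea can be illustrated with a toy simulation. All names and parameters here are assumptions made for the demonstration: `f` is a trivially invertible stand-in permutation, and `noisy_oracle` is a hypothetical circuit that answers correctly with probability 3/4 (that is, 1/2 + 1/V with V = 4), erring independently on each call, which is a simplification of the setting in the text.

```python
import random

K, C = 4, 3                          # toy parameters: k-bit strings, c elements per group
N = 1 << K

def f(x):
    """Toy permutation of the K-bit strings standing in for f_k (NOT one-way)."""
    return (5 * x + 3) % N

F_INV = {f(x): x for x in range(N)}   # used only to simulate the oracle

def bit(x, i):
    """The i-th bit of a K-bit string, with i = 1 the leftmost bit."""
    return (x >> (K - i)) & 1

def B(xs):
    """The predicate B_k of the construction."""
    acc = 0
    for m, x in enumerate(xs):
        acc ^= bit(F_INV[x], m // C + 1)
    return acc

def noisy_oracle(xs, rng, error=0.25):
    """Hypothetical imperfect circuit: flips the right answer with probability `error`."""
    return B(xs) ^ (1 if rng.random() < error else 0)

def sample_bit(z, h, trials, rng):
    """Estimate the h-th bit of f^{-1}(z) as the majority value of u XOR v."""
    pos = C * (h - 1)
    ones = 0
    for _ in range(trials):
        xs = [rng.randrange(N) for _ in range(C * K)]   # fresh helpers for each trial
        query = [f(x) for x in xs]
        query[pos] = z
        u = noisy_oracle(tuple(query), rng)
        v = 0
        for m, x in enumerate(xs):
            if m != pos:
                v ^= bit(x, m // C + 1)
        ones += u ^ v
    return 1 if 2 * ones > trials else 0                # avg above 1/2 means bit 1

def invert_by_sampling(z, trials=401, seed=0):
    """Recover all K bits of f^{-1}(z), leftmost bit first."""
    rng = random.Random(seed)
    y = 0
    for h in range(1, K + 1):
        y = (y << 1) | sample_bit(z, h, trials, rng)
    return y
```

With 401 trials per bit and a per-trial success rate of 3/4, the chance that any single majority vote goes wrong is negligible, so the recovered string matches the true preimage in practice.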


It should be clear that, for any i, if $s_i^C(x) > \frac{1}{2} + \frac{1}{V(k)}$ then we can, with certainty,
compute the ith bit of $f_k^{-1}(x)$ using the same technique. Thus, for each $x \in E_k$ such that
$s_i^C(x) > \frac{1}{2} + \frac{1}{V(k)}$ for every i from 1 to k, we can compute $f_k^{-1}(x)$ with a polynomial-size
circuit. Since the hypothesis of this lemma assumes that, for infinitely many values of
k, this is true for at least $1 - \frac{1}{k^d}$ of the $x \in E_k$, we can construct a family of polynomial-size
circuits which, for infinitely many values of k, compute $f_k^{-1}(x)$ for at least $1 - \frac{1}{k^d}$
of the $x \in E_k$.

It remains only to prove that we can calculate avg to within 1/(2V(k)) of avgtrue
with only a polynomial number of trials. If we perform t trials of the sampling procedure
described above to compute avg then, by the Central Limit Theorem (see [HPS]) we
have:


$$\mathrm{Prob}(|\mathrm{avg} - \mathrm{avgtrue}| > c) = 2(1 - \Phi(\delta))$$


where


$$\delta = 2c\sqrt{t}$$


and


$$\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du.$$
Note that we can never be certain our estimate is within the bound c but only sure
with arbitrarily high probability. If we wish to know how many trials are necessary in
order to be sure with probability $1 - \epsilon$ that avg is within a bound c of avgtrue, we must
solve the following inequality for t:


$$2(1 - \Phi(2c\sqrt{t})) \le \epsilon. \qquad (1)$$


In fact, when our procedure is converted into a circuit, we will be able to “build-in”
a particular set of t sets of inputs which can be used in sampling to assure that our
estimate avg is off by at most 1/(2V(k)). By calculations which can be found in the
Appendix, equation (1) above is satisfied if t is chosen such that:

$$t \ge \frac{\ln 2 - \ln \epsilon}{2c^2}.$$

Here, we want c = 1/(2V(k)), so that $1/(2c^2) = 2V(k)^2$, and for reasons which will be
made clear below, we set $\epsilon = 1/2^{2k}$ giving:

$$t \ge (\ln 2 - \ln(2^{-2k}))(2V(k)^2).$$

Since $\ln 2 - \ln(2^{-2k}) = (2k+1)\ln 2$, this is certainly satisfied by:

$$t \ge 6kV(k)^2.$$
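
The arithmetic behind this trial count can be checked numerically. The following is a minimal sketch, assuming only the bound $t \ge (\ln 2 - \ln \epsilon)/(2c^2)$ quoted from the Appendix; the function names and the sample choice $V(k) = k^3$ are ours and purely illustrative.

```python
import math

def trials_needed(c: float, eps: float) -> int:
    """Smallest integer t with t >= (ln 2 - ln eps) / (2 c^2)."""
    return math.ceil((math.log(2) - math.log(eps)) / (2 * c * c))

def trials_for_lemma(k: int, v: float) -> int:
    """Trial count for c = 1/(2 V(k)) and eps = 2^(-2k); v stands for V(k)."""
    # ln 2 - ln(2^(-2k)) = (2k + 1) ln 2, and 1 / (2 c^2) = 2 V(k)^2.
    return math.ceil((2 * k + 1) * math.log(2) * 2 * v * v)

# The closed form agrees with the general bound, and 6 k V(k)^2 trials suffice.
for k in (1, 5, 20, 100):
    v = k ** 3  # hypothetical polynomial V(k) = k^3, chosen only for illustration
    assert trials_for_lemma(k, v) <= 6 * k * v * v
```

The closed form avoids evaluating $\ln(2^{-2k})$ in floating point, which would underflow for large k.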


We now show that, since t is polynomial in k, we can build a polynomial-size circuit
which correctly computes $f_k^{-1}(x)$ for every $x \in E_k$ such that $s_k^i(x) > \frac{1}{2} + \frac{1}{V(k)}$. The
argument is probabilistic. Suppose that $t = 6kV(k)^2$ and let

$$W = \{\mathrm{input}_1, \mathrm{input}_2, \ldots, \mathrm{input}_t\}$$

be a set of t randomly chosen inputs where each $\mathrm{input}_i$ is a set of ck - 1 elements of
$E_k$ which will be used for sampling, i.e., for each i:

$$\mathrm{input}_i = \{x_1, x_2, \ldots, x_{ck-1}\}, \quad \text{each } x_j \in E_k.$$


For a randomly chosen string $x \in E_k$ such that $s_k^i(x) > \frac{1}{2} + \frac{1}{V(k)}$ for every value of
i from 1 to k, let us say that the set W fails on x if, for at least one value of i, the
value of avg computed while trying to get the ith bit of $f_k^{-1}(x)$ differs from avgtrue
by more than 1/(2V(k)). Since the chance of failing on any given bit of $f_k^{-1}(x)$ is less
than $1/2^{2k}$ (because $\epsilon = 1/2^{2k}$ above) and there are k bits we get:

$$\mathrm{Prob}[W \text{ fails on } x] < \frac{k}{2^{2k}}.$$

Since there are at most $2^k$ elements in $E_k$:

$$\mathrm{Prob}[W \text{ fails on at least one } x \in E_k] < \frac{k}{2^k}.$$

Thus:

$$\mathrm{Prob}[W \text{ does not fail on any } x \in E_k] > 1 - \frac{k}{2^k} > 0.$$


So, there must exist some set W of t inputs which can be used in sampling to assure
that for every $x \in E_k$ such that $s_k^i(x) > \frac{1}{2} + \frac{1}{V(k)}$ the value of $f_k^{-1}(x)$ can be correctly
computed with certainty. This set W can simply be “built-in” to the circuit $A_k$ which
inverts $f_k$. The entire algorithm for computing $f_k^{-1}(x)$ follows on the next page.

Let x be such that $s_k^i(x) > \frac{1}{2} + \frac{1}{V(k)}$. Then, by definition, the circuit $C_k$
correctly computes the predicate $B_k$ with probability at least $\frac{1}{2} + \frac{1}{V(k)}$ on all inputs
$(x_1, x_2, \ldots, x_{ck}) \in D_k$ which have x in the ith group. As before, given such an input,
we say that element $x_j$ occupies “position” j in the input. Since the ith group consists
of elements $x_{c(i-1)+1}$ through $x_{ci}$, there must exist some position j from c(i - 1) + 1
to ic such that $C_k$ computes $B_k$ with probability at least $\frac{1}{2} + \frac{1}{V(k)}$ whenever x is in
position j. This position will, for each i, be “built-in” to the circuit $A_k$ which inverts
$f_k$. We have assumed in the following algorithm, without loss of generality, that it is
position c(i - 1) + 1, i.e., the first position of each group.

Algorithm A2


input: $x \in E_k$
output: $y \in E_k$ such that $y = f_k^{-1}(x)$ with probability $1 - \frac{1}{k^d}$


(1) Set y = null
(2) For h = 1 to k do
    (2.1) Set count = 0
    (2.2) Set $t = 6kV(k)^2$
    (2.3) For sample = 1 to t do

        (2.3.1) Choose next “built-in” input for sampling:
                $z_1, z_2, \ldots, z_{c(h-1)}, z_{c(h-1)+2}, \ldots, z_{ck}$

        (2.3.2) Set $u = C_k((f_k(z_1), \ldots, f_k(z_{c(h-1)}), x, f_k(z_{c(h-1)+2}), \ldots, f_k(z_{ck})))$

        (2.3.3) Set $v = \bigoplus_{i=1}^{h-1} \bigoplus_{j=1}^{c}$ (ith bit of $z_{c(i-1)+j}$)
                $\oplus \bigoplus_{j=2}^{c}$ (hth bit of $z_{c(h-1)+j}$)
                $\oplus \bigoplus_{i=h+1}^{k} \bigoplus_{j=1}^{c}$ (ith bit of $z_{c(i-1)+j}$)

        (2.3.4) Set count = count + $(u \oplus v)$

    (2.4) Set avg = count / t

    (2.5) If avg > $\frac{1}{2} + \frac{1}{2V(k)}$ then set w = 1

    (2.6) Else if avg < $\frac{1}{2} - \frac{1}{2V(k)}$ then set w = 0

    (2.7) Set $y = w \circ y$ ($\circ$ denotes concatenation)

(3) Output y
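
The sample-and-threshold step at the heart of Algorithm A2 can be sketched in isolation. The code below is only an illustration under simplifying assumptions: the circuit $C_k$, the predicate $B_k$ and the built-in sample set W are replaced by a hypothetical `noisy_bit` oracle that guesses bit h correctly with probability $\frac{1}{2} + \frac{1}{V(k)}$ using fresh randomness. All names are ours.

```python
import random

def make_noisy_oracle(hidden, v):
    """Toy stand-in for 'u XOR v' in step (2.3.4): right w.p. 1/2 + 1/v."""
    def noisy_bit(h, rng):
        correct = rng.random() < 0.5 + 1 / v
        return hidden[h] if correct else 1 - hidden[h]
    return noisy_bit

def recover_bits(noisy_bit, k, v, rng):
    """Estimate each of k bits by averaging t guesses and thresholding."""
    t = 6 * k * v * v                  # number of trials per bit
    out = []
    for h in range(k):
        count = sum(noisy_bit(h, rng) for _ in range(t))
        avg = count / t
        if avg >= 0.5 + 1 / (2 * v):
            out.append(1)
        elif avg <= 0.5 - 1 / (2 * v):
            out.append(0)
        else:
            out.append(None)           # estimate too close to 1/2 to decide
    return out

hidden = [1, 0, 1, 1, 0, 0, 1, 0]      # hypothetical bits of the preimage
rng = random.Random(0)                 # fixed seed so the run is repeatable
recovered = recover_bits(make_noisy_oracle(hidden, 4), len(hidden), 4, rng)
```

Unlike the circuit $A_k$, this sketch draws new random samples on every run, so it is correct only with overwhelming probability rather than with certainty; the probabilistic argument above is what removes that residual error by fixing a good set W.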

It should be clear that this algorithm can easily be converted into a polynomial-size
circuit $A_k$ with k Boolean inputs. This completes the proof of Lemma 1.2.2. Note that
$f_k$ is assumed to be a one-way function which means that, for some constant d, no
family of polynomial-size circuits can invert it with probability at least $1 - \frac{1}{k^d}$. Thus
we have actually shown:

Corollary 1.2.2 Let $f = \{f_k\}$ be a polynomial time computable function which
takes inputs from the domain $E_k$ and suppose that there exists a constant d such
that any family of polynomial-size circuits which tries to invert $f_k$ is, for infinitely
many k, correct for less than $1 - \frac{1}{k^d}$ of the $x \in E_k$. Let $C = \{C_k\}$ be any family of
polynomial-size circuits where each circuit $C_k$ takes inputs from the domain $D_k$ (i.e.,
each circuit $C_k$ is trying to compute the predicate $B_k$). Let V(k) be any polynomial.
Then, for all sufficiently large k we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[ s_k^i(x) > \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of } i \text{ from 1 to } k \right] < 1 - \frac{1}{k^d}.$$


The following lemma completes the proof of Lemma 1.2 and therefore also completes 
the proof of Theorem 1. 


Lemma 1.2.3 Let $I_k$ be the set of all k-bit strings and let $f = \{f_k\}$ be a weak one-way
function over a domain $E_k \subseteq I_k$. Thus, there exists a constant d such that any family
of polynomial-size circuits which tries to invert $f_k$ is, for infinitely many k, correct for
less than $1 - \frac{1}{k^d}$ of the $x \in E_k$. Let $D_k$ be a new domain consisting of the cartesian
product of $k^{2d}$ copies of $E_k$, i.e.:

$$D_k = \{ (x_1, x_2, \ldots, x_{k^{2d}}) \mid x_1 \in E_k, \ldots, x_{k^{2d}} \in E_k \}$$

and let $B_k$ be the predicate over the domain $D_k$ which is described in Lemma 1.2.
Note that we are finally giving the value of the constant c which has been used
until now to specify inputs in $D_k$; c is equal to $k^{2d-1}$. Now, let $C = \{C_k\}$ be any
family of polynomial-size circuits where each circuit $C_k$ takes inputs from the domain
$D_k$ and tries to compute the predicate $B_k$. Let Q(k) be any polynomial. Then, for all
sufficiently large k we have, for a randomly chosen $(x_1, x_2, \ldots, x_{k^{2d}}) \in D_k$:

$$\mathrm{Prob}[\, C_k((x_1, x_2, \ldots, x_{k^{2d}})) = B_k((x_1, x_2, \ldots, x_{k^{2d}})) \,] < \frac{1}{2} + \frac{1}{Q(k)}.$$


Proof Let V(k) be any polynomial. Then, from Corollary 1.2.2, for all sufficiently
large k we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[ s_k^i(x) \le \frac{1}{2} + \frac{1}{V(k)} \text{ for some value of } i \text{ from 1 to } k \right] > \frac{1}{k^d}.$$

Then, as shown by Goldwasser [G], there exists a polynomial Q(k) = aV(k) for some
constant a such that, for a random $(x_1, x_2, \ldots, x_{k^{2d}}) \in D_k$:

$$\mathrm{Prob}[\, C_k((x_1, x_2, \ldots, x_{k^{2d}})) = B_k((x_1, x_2, \ldots, x_{k^{2d}})) \,] < \frac{1}{2} + \frac{1}{Q(k)}.$$

This is a somewhat complicated counting argument so the proof is omitted here.
Details can be found in [G].


2.2 Statistical Tests for Strings 


It turns out that in order to be useful for simulating general probabilistic
computation, we would like the output from a “good” pseudo-random bit generator
to be unpredictable in a slightly different way from that discussed so far. Until now,
we have required that, given any prefix of a sequence output by our generator, it
be computationally impossible to consistently predict the next bit in the sequence.
Consider a somewhat different notion of unpredictability: Given some n-bit sequence
output by the generator and some truly random n-bit sequence, it is computationally
infeasible to distinguish the two sequences with a better than 50-50 chance. Note that
this kind of unpredictability involves the impossibility of distinguishing between two
sequences whereas the next-bit test involves the impossibility of predicting a given bit.
The following definition formalizes this:


Definition Let P be a polynomial, $S_k$ a multiset consisting of P(k)-long bit sequences
and $S = \bigcup_k S_k$. A polynomial-size statistical test for strings is a family of circuits
$C = \{C_k\}$. Each circuit $C_k$ has P(k) Boolean inputs, one Boolean output and size
polynomial in P(k). The multiset S passes the test C if for every polynomial Q, and
all sufficiently large k:

$$|p_k^S - p_k^R| < \frac{1}{Q(k)}$$

where $p_k^S$ denotes the probability that $C_k$ outputs 1 on a randomly selected $s \in S_k$
and $p_k^R$ denotes the probability that $C_k$ outputs 1 on a randomly selected P(k)-long
bit sequence. We will refer to both C and $C_k$ as statistical tests.


It turns out, as proven in the next theorem of Yao’s, that these two notions of 
unpredictability are equivalent. Thus, the output of any CSB generator passes all 
polynomial-size statistical tests. 


Theorem 2 (Yao [Y]) Let P be a polynomial, $S_k$ a multiset consisting of P(k)-long
bit sequences and $S = \bigcup_k S_k$. The set S passes all polynomial-size next-bit tests if and
only if it passes all polynomial-size statistical tests.


Proof The easy half is to show that if S passes all polynomial-size statistical tests then
it passes all polynomial-size next-bit tests. Suppose $S = \bigcup_k S_k$ fails a polynomial-size
next-bit test. Then, by definition, there is a family of Poly(k) size circuits $C = \{C_k^i\}$
such that each $C_k^i$ has i < P(k) Boolean inputs and one Boolean output. Furthermore,
there exists a polynomial Q such that for each of infinitely many values of k there
exists some i such that:

$$p_{k,i}^S \ge \frac{1}{2} + \frac{1}{Q(k)}.$$

Recall that $p_{k,i}^S$ is the probability that the circuit $C_k^i$ outputs the correct i + 1st bit of
a sequence $s \in S_k$ when given the first i bits of s as input.


We now show that S must fail a statistical test $A = \{A_k\}$ where each circuit $A_k$
has P(k) inputs and size Poly(P(k)). This statistical test simply uses the next-bit test
as a subroutine. Suppose k is chosen such that the next-bit test succeeds with high
probability on strings in $S_k$. Then, there must be a value of i such that the circuit $C_k^i$
predicts the i + 1st bit of strings in $S_k$ with high probability. The statistical test circuit
$A_k$ operates as follows. It inputs the first i bits of its own input to the next-bit test
circuit $C_k^i$. It then compares the bit predicted by $C_k^i$ with the true i + 1st bit of its
own input. If the circuit $C_k^i$ predicted correctly, then the statistical test $A_k$ outputs a
1, otherwise it outputs a 0. Now, the next-bit test $C_k^i$ succeeds with high probability
given the first i bits of a string in $S_k$ but can only succeed at most half the time on truly
random i-bit strings. Thus, the statistical test $A_k$ will effectively distinguish between
sequences from $S_k$ and truly random P(k)-bit sequences.


More formally, let $p_k^S$ denote the probability that $A_k$ outputs 1 on a randomly
selected $s \in S_k$ and $p_k^R$ denote the probability that $A_k$ outputs 1 on a randomly selected
P(k)-long bit sequence. Then, by the argument above, we get for infinitely many pairs
k and i:

$$p_k^S = p_{k,i}^S$$

and

$$p_{k,i}^S \ge \frac{1}{2} + \frac{1}{Q(k)}$$

but

$$p_k^R = \frac{1}{2}$$

so

$$p_k^S - p_k^R \ge \frac{1}{Q(k)}.$$

Thus, S fails the statistical test $A = \{A_k\}$ and this half of the theorem is proven.
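
The construction of the statistical test from a next-bit predictor can be made concrete on a small example. The sketch below is illustrative only: the 4-bit multiset S (whose last bit is the parity of the first three), the predictor, and all function names are hypothetical and are not taken from the constructions above.

```python
from itertools import product

def make_statistical_test(predictor, i):
    """Build A_k: output 1 iff predictor, given the first i bits, guesses bit i+1."""
    def test(bits):
        return 1 if predictor(bits[:i]) == bits[i] else 0
    return test

def acceptance_probability(test, strings):
    """Exact probability that test outputs 1 on a uniformly chosen string."""
    strings = list(strings)
    return sum(test(s) for s in strings) / len(strings)

n = 4
# Toy "generator output": the last bit is the parity of the first three bits.
S = [bits + (sum(bits) % 2,) for bits in product((0, 1), repeat=n - 1)]
U = list(product((0, 1), repeat=n))          # all truly random candidates

def parity_predictor(prefix):
    return sum(prefix) % 2                   # predicts the 4th bit from 3 bits

A = make_statistical_test(parity_predictor, n - 1)
p_S = acceptance_probability(A, S)           # predictor is always right on S
p_R = acceptance_probability(A, U)           # right exactly half the time on U
```

Here the gap $p_k^S - p_k^R$ is as large as it can be; in the proof it is only guaranteed to be at least 1/Q(k).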


The more difficult task is proving that if $S = \bigcup_k S_k$ passes all next-bit tests then it
passes all polynomial-size statistical tests. The idea is as follows. Suppose S fails some
statistical test $C = \{C_k\}$. We want to construct a next-bit test for S. For infinitely
many values of k, the circuit $C_k$ can distinguish between elements of $S_k$ and random
P(k)-bit strings. The first obvious question is: Which of these P(k) bits should our
next-bit test attempt to predict? It turns out that it is possible to find a value of i
which has the following interesting property: If the circuit $C_k$ is given as input the first
i bits of some string $s \in S_k$ followed by a random sequence of P(k) - i bits, it can
detect with a certain probability whether the i + 1st bit of its input is the correct i + 1st
bit of s. By repeating this experiment a number of times with different sequences of
P(k) - i bits, it is possible to ascertain the i + 1st bit of s with a better than 50 percent
chance. A circuit $A_k^i$ which has the statistical test $C_k$ “built in” can be constructed
to repeat this experiment the appropriate number of times and thereby constitute a
next-bit test.

Formally, suppose that a multiset $S = \bigcup_k S_k$ fails some polynomial-size statistical
test. Then, by definition, there is a family of polynomial-size circuits $C = \{C_k\}$ such
that $C_k$ has P(k) inputs and one output. Furthermore, there exists a polynomial Q
such that for infinitely many values of k:

$$|p_k^S - p_k^R| \ge \frac{1}{Q(k)}$$

where $p_k^S$ is the probability that $C_k$ outputs 1 on a randomly selected sequence in
$S_k$ and $p_k^R$ is the probability that $C_k$ outputs 1 on a randomly selected P(k)-long bit
sequence. We now show that S must necessarily fail a next-bit test.

Let [x|i]y denote the concatenation of the first i bits of x with y. For each $i \le P(k)$,
let $p_k^i$ denote the probability that $C_k([x|i]y) = 1$ where x is chosen randomly from $S_k$
and y is a random string of P(k) - i bits. Thus, $p_k^i$ is the probability that the statistical
test $C_k$ outputs 1 when given as input the first i bits of some string in $S_k$ followed by
P(k) - i random bits. Note that $p_k^0 = p_k^R$ and $p_k^{P(k)} = p_k^S$.

We now construct a family of circuits $A = \{A_k^i\}$ which constitute a next-bit test
for S. Each $A_k^i$ has i < P(k) inputs and size polynomial in P(k). Let k be chosen such
that

$$|p_k^S - p_k^R| \ge \frac{1}{Q(k)}.$$

Assume, without loss of generality, that

$$p_k^S - p_k^R \ge \frac{1}{Q(k)}.$$

Since the P(k) differences $p_k^{i+1} - p_k^i$ sum to $p_k^{P(k)} - p_k^0 = p_k^S - p_k^R$, there must
clearly be a value of i such that

$$p_k^{i+1} - p_k^i \ge \left(\frac{1}{Q(k)}\right)\left(\frac{1}{P(k)}\right).$$

(Remember, i varies between 0 and P(k).) The circuit $A_k^i$ will correctly predict this
i + 1st bit of any sequence in $S_k$ with high probability.
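
The choice of i can be illustrated by computing the hybrid probabilities $p_k^i$ exactly on a small example. This is a sketch under toy assumptions: the multiset S and the test C below are hypothetical 4-bit examples, and exhaustive enumeration replaces the probabilities over random bits.

```python
from itertools import product
from fractions import Fraction

def hybrid_probabilities(test, S, n):
    """p[i] = Prob[test(first i bits of s, then n - i uniform bits) = 1]."""
    probs = []
    for i in range(n + 1):
        total = 0
        for s in S:
            for tail in product((0, 1), repeat=n - i):
                total += test(tuple(s[:i]) + tail)
        probs.append(Fraction(total, len(S) * 2 ** (n - i)))
    return probs

n = 4
# Toy multiset: the 4th bit is the parity of the first three bits.
S = [bits + (sum(bits) % 2,) for bits in product((0, 1), repeat=n - 1)]

def C(bits):
    """Toy statistical test: accept iff the 4th bit equals that parity."""
    return 1 if bits[3] == sum(bits[:3]) % 2 else 0

p = hybrid_probabilities(C, S, n)            # p[0] = p^R, p[n] = p^S
gaps = [p[i + 1] - p[i] for i in range(n)]   # these telescope to p^S - p^R
best_i = max(range(n), key=lambda i: gaps[i])
```

Because the gaps telescope, the largest one is at least $(p_k^S - p_k^R)/P(k)$, which is exactly the bound used above; in this toy case the whole advantage sits at the last position.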

Recall that $p_k^{i+1}$ is the probability that $C_k$ outputs 1 on any string consisting of
an i + 1-bit initial segment of a string in $S_k$ followed by P(k) - (i + 1) random bits.
Let $q_k^{i+1}$ be the probability that $C_k$ outputs 1 when given the first i bits of a string in
$S_k$, followed by the incorrect i + 1st bit, followed by P(k) - (i + 1) random bits. Note
that:

$$p_k^i = \frac{1}{2} p_k^{i+1} + \frac{1}{2} q_k^{i+1}$$

thus:

$$p_k^{i+1} - p_k^i = p_k^{i+1} - \left(\frac{1}{2} p_k^{i+1} + \frac{1}{2} q_k^{i+1}\right) \ge \left(\frac{1}{Q(k)}\right)\left(\frac{1}{P(k)}\right)$$

$$\frac{1}{2} p_k^{i+1} - \frac{1}{2} q_k^{i+1} \ge \left(\frac{1}{Q(k)}\right)\left(\frac{1}{P(k)}\right) \qquad (1)$$

so:

$$p_k^{i+1} > q_k^{i+1}.$$

Consider in particular line (1). The fact that the difference between $p_k^{i+1}$ and $q_k^{i+1}$ is
large means that if $C_k$ is given as input the first i bits of some string $s \in S_k$ followed
by P(k) - i random bits, it can effectively distinguish between those strings containing
the correct i + 1st bit of s in their i + 1st position, and those containing the incorrect
i + 1st bit.

Since $p_k^{i+1} + q_k^{i+1} = 1$ and $p_k^{i+1} > q_k^{i+1}$, if we can be sure of correctly predicting
the i + 1st bit of a randomly drawn string in $S_k$ with probability $p_k^{i+1}$, then we will be
correct more than half the time. The next-bit test $A_k^i$ is constructed to do this.


So, suppose $A_k^i$ is given as input $b_1, b_2, \ldots, b_i$, the first i bits of a randomly chosen
sequence $s \in S_k$. Let b be the correct i + 1st bit of s and let $\bar{b}$ be the incorrect i + 1st
bit. The circuit $A_k^i$ first calculates the probabilities:

$$r_b = \mathrm{Prob}[\, C_k(b_1, \ldots, b_i, b, b_{i+2}, \ldots, b_{P(k)}) = 1 \,]$$

$$r_{\bar{b}} = \mathrm{Prob}[\, C_k(b_1, \ldots, b_i, \bar{b}, b_{i+2}, \ldots, b_{P(k)}) = 1 \,]$$

where $b_{i+2}, \ldots, b_{P(k)}$ are chosen randomly. In fact, these probabilities can only be
estimated with high accuracy but for the sake of clarity, let us assume for a moment
that they can be calculated exactly. A discussion on estimating the probabilities
appears later in the proof. Note that $A_k^i$ does not actually know which is the correct
and which the incorrect i + 1st bit. It simply calculates the two probabilities: one of
them will be $r_b$, the other, $r_{\bar{b}}$. The predicting circuit $A_k^i$ now chooses $b_{i+1} = b$ with
probability $\frac{r_b}{r_b + r_{\bar{b}}}$ and chooses $b_{i+1} = \bar{b}$ with probability $\frac{r_{\bar{b}}}{r_b + r_{\bar{b}}}$.


Suppose we calculate the value of $r_b$ for each i-bit prefix of a string in $S_k$ and take
the average of all these probabilities. This average is just the total probability with
which the circuit $A_k^i$ correctly predicts the i + 1st bit of a randomly drawn string in $S_k$.
This average is also equal to $p_k^{i+1}$ so $A_k^i$ predicts correctly with probability $p_k^{i+1} > \frac{1}{2}$.
Thus there must exist some polynomial R such that $A_k^i$ correctly chooses the i + 1st
bit for sequences in $S_k$ with probability at least $\frac{1}{2} + \frac{1}{R(k)}$. The set S therefore fails
the next-bit test $A = \{A_k^i\}$ and this part of the proof is finished. A discussion on
estimating the probabilities $r_b$ and $r_{\bar{b}}$ follows.


Estimating the probabilities
Recall the probabilities $r_b$ and $r_{\bar{b}}$ which are used to predict the i + 1st bit:

$$r_b = \mathrm{Prob}[\, C_k(b_1, \ldots, b_i, b, b_{i+2}, \ldots, b_{P(k)}) = 1 \,]$$

$$r_{\bar{b}} = \mathrm{Prob}[\, C_k(b_1, \ldots, b_i, \bar{b}, b_{i+2}, \ldots, b_{P(k)}) = 1 \,]$$

where b is the correct i + 1st bit and $\bar{b}$ is the incorrect i + 1st bit. As mentioned above,
these probabilities cannot be calculated exactly. Here we show that they can, however,
be estimated closely enough so that the next-bit test will still succeed with more than
a 50-50 chance.

The idea is as follows. Suppose the next-bit test circuit is given as input the bits
$b_1, \ldots, b_i$, an i-bit prefix of some string in $S_k$. In order to estimate the probability $r_b$
the next-bit circuit first chooses a sequence of random bits $b_{i+2}, \ldots, b_{P(k)}$ and then uses
the statistical test $C_k$ as a “subroutine” to calculate whether

$$C_k(b_1, b_2, \ldots, b_i, b, b_{i+2}, \ldots, b_{P(k)})$$

causes a 0 or a 1 to be output. The next-bit circuit then repeats this same procedure a
number of times using different random sequences $b_{i+2}, \ldots, b_{P(k)}$. Finally, it takes the
average of all the 0,1 values output by the circuit $C_k$ and uses this as the estimate for
$r_b$. The estimate for $r_{\bar{b}}$ is computed similarly. As we saw in Lemma 1.2.2, this method
is known as “sampling” and each repetition of the procedure is a “trial”. Note that
the next-bit circuit does not actually know whether b = 0 or b = 1. It simply does
this “sampling” first with b = 1 and then with b = 0. One of the results will be an
estimate for $r_b$, the other an estimate for $r_{\bar{b}}$.

Clearly, the more trials we make the closer the resulting estimates will be to $r_b$
and $r_{\bar{b}}$. The question is, for each input of i bits, how far from $r_b$ and $r_{\bar{b}}$ can these
estimates be and still assure that the next-bit circuit will correctly predict the i + 1st
bit more than fifty percent of the time?

Recall that $p_k^{i+1}$ is the probability that the statistical test $C_k$ outputs 1 when given
as input the first i + 1 bits of a sequence in $S_k$ followed by P(k) - (i + 1) random bits.
Also, $q_k^{i+1}$ is the probability that $C_k$ outputs 1 when given as input the first i bits of a
sequence in $S_k$ followed by the incorrect i + 1st bit, followed by P(k) - (i + 1) random
bits. Thus, $p_k^{i+1}$ is just the average of all values of $r_b$ taken over all i-bit prefixes and
$q_k^{i+1}$ is, similarly, the average of all values of $r_{\bar{b}}$. If we could calculate each $r_b$ and $r_{\bar{b}}$
exactly, our next-bit circuit would predict the i + 1st bit correctly with probability
$p_k^{i+1}$. From above, we know that:

$$p_k^{i+1} - q_k^{i+1} \ge \frac{2}{P(k)Q(k)}$$

where P and Q are both polynomials. Thus, if each of the estimates for $r_b$ and $r_{\bar{b}}$ is
off by less than 1/(P(k)Q(k)) then the resulting estimate for $p_k^{i+1}$ will still be greater
than the estimate for $q_k^{i+1}$. So, the next-bit circuit will still succeed more than half the
time.

We must therefore show that with only Poly(k) trials, we can get estimates for
$r_b$ and $r_{\bar{b}}$ which are off by less than 1/(P(k)Q(k)). This is necessary to ensure that the
next-bit circuit has size polynomial in k as required.

Let p be a probability which is estimated by a value $\hat{p}$, calculated by performing t
trials of the sampling procedure described above. Then, by the Central Limit Theorem
(see [HPS]) we have:

$$\mathrm{Prob}(|p - \hat{p}| > c) = 2(1 - \Phi(\delta))$$

where

$$\delta = 2c\sqrt{t}$$

and

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du.$$

Note that we can never be certain our estimate is within the bound c but only sure
with arbitrarily high probability. If we wish to know how many trials are necessary in
order to be sure with probability $1 - \epsilon$ that our estimate is within the bound c, we must
solve the following inequality for t:

$$2(1 - \Phi(2c\sqrt{t})) < \epsilon. \qquad (1)$$


In fact, when our procedure is converted into a circuit, we will be able to “build-in”
a particular set of t strings of bits which can be used in sampling to assure that our
estimates are off by less than 1/(P(k)Q(k)). By calculations which can be found in the
Appendix, equation (1) above is satisfied by:

$$t \ge \frac{\ln 2 - \ln \epsilon}{2c^2}.$$

As discussed above, we want c = 1/(P(k)Q(k)) and, for reasons which will be made
clear below, we set $\epsilon = 1/2^{P(k)}$:

$$t \ge \frac{1}{2}\left(\ln 2 - \ln\frac{1}{2^{P(k)}}\right)(P(k)Q(k))^2.$$

This is certainly satisfied by:

$$t \ge (P(k))^3 (Q(k))^2 + (P(k))^2 (Q(k))^2.$$


We will now show that if the number of trials t satisfies this inequality then
the next-bit circuit can actually be built in such a way that, given any i-bit prefix
of a string in $S_k$ as input, the circuit can, with certainty, estimate $r_b$ and $r_{\bar{b}}$ with
an error of less than 1/(P(k)Q(k)). The argument is probabilistic. Suppose that
$t = (P(k))^3 (Q(k))^2 + (P(k))^2 (Q(k))^2$ and let $W = \{w_1, \ldots, w_t\}$ be a set of random
strings each of length P(k) - (i + 1). In other words each element $w_j$ of W is a string
of random bits:

$$w_j = b_{i+2}, b_{i+3}, \ldots, b_{P(k)}.$$

These strings will be used to do the sampling. Let x be a randomly chosen i-bit prefix
of a string in $S_k$. We will say that the set W fails on x if, given x as input, at least one of
the probabilities $r_b$ and $r_{\bar{b}}$ cannot be estimated with an error of less than 1/(P(k)Q(k))
when using all of the strings in W to do the sampling. Since the chance of failing on
either $r_b$ or $r_{\bar{b}}$ is less than $1/2^{P(k)}$ (because $\epsilon = 1/2^{P(k)}$), we get:


$$\mathrm{Prob}[W \text{ fails on a random } i\text{-bit prefix of a string in } S_k] < \frac{2}{2^{P(k)}}.$$

Since there are at most $2^i$ i-bit prefixes:

$$\mathrm{Prob}[W \text{ fails on at least one } i\text{-bit prefix of a string in } S_k] < \frac{2^{i+1}}{2^{P(k)}}.$$

Since the next-bit circuit is trying to predict the i + 1st bit and strings in $S_k$ have
length P(k), we know that i can be at most P(k) - 1. Therefore,

$$\mathrm{Prob}[W \text{ fails on at least one } i\text{-bit prefix of a string in } S_k] < 1.$$

So, there must exist at least one set W of t strings which does not fail on any inputs to
the next-bit circuit. This set W will be “built into” the next-bit circuit, assuring that
this circuit can always estimate the probabilities $r_b$ and $r_{\bar{b}}$ with sufficient accuracy to
be sure of predicting correctly more than half the time. Since t is polynomial in k, the
next-bit circuit will have size polynomial in k as required. A complete algorithm for
computing the next bit is given on the next page. It should be clear that this procedure
can be converted into a polynomial-size next-bit test.


Algorithm A3


input: an i-bit prefix of a string $x \in S_k$ given as $b_1, \ldots, b_i$
output: y = i + 1st bit of x (with probability $\frac{1}{2} + \frac{1}{R(k)}$ for some polynomial R(k))


(1) Set $\mathrm{count}_0 = 0$
(2) Set $\mathrm{count}_1 = 0$
(3) Set $t = (P(k))^3 (Q(k))^2 + (P(k))^2 (Q(k))^2$
(4) For sample = 1 to t do

    (4.1) Choose next “built-in” input for sampling:
          $b_{i+2}, b_{i+3}, \ldots, b_{P(k)}$

    (4.2) Set $\mathrm{count}_0 = \mathrm{count}_0 + C_k(b_1, \ldots, b_i, 0, b_{i+2}, \ldots, b_{P(k)})$
    (4.3) Set $\mathrm{count}_1 = \mathrm{count}_1 + C_k(b_1, \ldots, b_i, 1, b_{i+2}, \ldots, b_{P(k)})$

(5) Set $q_0 = \mathrm{count}_0 / t$

(6) Set $q_1 = \mathrm{count}_1 / t$

(7) Output y = 0 with probability $\frac{q_0}{q_0 + q_1}$
    Output y = 1 with probability $\frac{q_1}{q_0 + q_1}$

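
Algorithm A3 translates almost line for line into code. The following is a minimal sketch, assuming the statistical test is available as an ordinary function on bit tuples and that the built-in sample set W is passed in explicitly; the toy test, the 5-bit strings and the fair-coin fallback for the case $q_0 + q_1 = 0$ are our assumptions and do not appear in the algorithm.

```python
import random

def next_bit_predictor(test, built_in_tails, rng):
    """Return a predictor that follows steps (1)-(7) of Algorithm A3."""
    def predict(prefix):
        t = len(built_in_tails)
        # Steps (4)-(6): estimate the acceptance rates with next bit 0 and 1.
        q0 = sum(test(prefix + (0,) + tail) for tail in built_in_tails) / t
        q1 = sum(test(prefix + (1,) + tail) for tail in built_in_tails) / t
        if q0 + q1 == 0:
            return rng.randrange(2)    # assumption: fall back to a fair coin
        # Step (7): output 0 with probability q0 / (q0 + q1), otherwise 1.
        return 0 if rng.random() < q0 / (q0 + q1) else 1
    return predict

def C(bits):
    """Toy test on 5-bit strings: accept iff bit 4 is the parity of bits 1-3."""
    return 1 if bits[3] == sum(bits[:3]) % 2 else 0

W = [(0,), (1,)]                       # hypothetical built-in tails, length 1
predict = next_bit_predictor(C, W, random.Random(0))
guess_a = predict((1, 0, 0))           # parity of the prefix is 1
guess_b = predict((1, 1, 0))           # parity of the prefix is 0
```

In this toy case one of the two estimates is always 0, so the randomized choice in step (7) becomes deterministic; in general the predictor is only correct with probability somewhat above one half.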

This concludes the section on estimating the probabilities as well as the proof of
Theorem 2. Since the output of any CSB generator passes all next-bit tests, we have
shown:

Corollary 2 Let G be a CSB generator whose output consists of the set $S = \bigcup_k S_k$
as defined above. Then S passes all polynomial-size statistical tests.


2.3 Simulating Probabilistic Computation 


It remains to explore the usefulness of CSB generators in simulating general
probabilistic computation. It is here that the notion of statistical tests and the result
of Theorem 2 will be helpful. First, we give two standard definitions from complexity
theory.

Definition The class DTIME(T(n)) denotes the family of languages accepted by 
deterministic Turing machines which halt after at most T(n) steps, where n is the 
input length. 

Definition A language L is in the class R (“Random polynomial time”) if and only if
there exists a polynomial P and a Turing machine $M_R$ such that:


(i) $M_R$ takes two inputs x and y; input x is to be checked for membership in L and the
meaning of input y will become clear below. The machine $M_R$ runs in time P(|x|).


(ii) if $x \notin L$ then for a randomly chosen y such that |y| = P(|x|) we have Prob{$M_R(x, y)$ accepts} = 0.
(iii) if $x \in L$ then for a randomly chosen y such that |y| = P(|x|) we have Prob{$M_R(x, y)$ accepts} $\ge \frac{1}{2}$.


The input y in the above definition can be thought of as a random sequence of bits
or coin flips which may witness the membership of x in L. For the sake of clarity in our
definition, the string y is given to $M_R$ as input and the machine then computes a result
deterministically which depends only on x and y. Intuitively, one may also choose
to think of $M_R$ as being given only x and actually generating the string y through
some random selection process such as coin flips. In this setting, $M_R$ is considered a
“probabilistic” machine. Given some input x, if $M_R$ outputs 1, we know for certain
that $x \in L$ but if $M_R$ outputs 0 we know only that $x \notin L$ with probability at least $\frac{1}{2}$. The
computation of $M_R$ on x can be repeated a number of times so that we can be sure of
our answer with arbitrarily high probability.


In terms of our original definition, if we run $M_R(x, y)$ for each of the $2^{P(|x|)}$ possible
values of y and the machine outputs 0 every time, then we know with total certainty
that $x \notin L$. Thus, we have essentially shown the following:


Fact $R \subseteq \bigcup_{c>0} \mathrm{DTIME}(2^{n^c})$.


Proof Let $L \in R$ be accepted by a probabilistic machine $M_R$ in time P(n) where P
is a polynomial. The language L can be accepted by a deterministic machine M as
follows. On input x machine M runs $M_R(x, y)$ on each of the $2^{P(|x|)}$ possible strings
y of length equal to P(|x|). The machine M accepts x if and only if there exists
some value of y such that $M_R(x, y)$ accepts. Otherwise M rejects. It should be clear
that M accepts exactly the language L in time $2^{Q(|x|)}$ for some polynomial Q. Thus
$L \in \bigcup_{c>0} \mathrm{DTIME}(2^{n^c})$.
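
The exhaustive simulation in this proof is short enough to write out. The sketch below is only an illustration: `toy_machine` is a hypothetical stand-in for $M_R$ that accepts a composite number x when the witness y encodes a nontrivial divisor. It has the one-sided behaviour required by condition (ii), but we do not claim it meets the probability bound of condition (iii).

```python
from itertools import product

def deterministic_accepts(machine, x, witness_len):
    """Machine M: accept x iff some witness y of the given length makes M_R accept."""
    return any(machine(x, y) for y in product((0, 1), repeat=witness_len))

def toy_machine(x, y):
    """Hypothetical M_R: read y as a binary number d; accept iff d divides x nontrivially."""
    d = int("".join(map(str, y)), 2)
    return 1 < d < x and x % d == 0

accepts_15 = deterministic_accepts(toy_machine, 15, 4)   # 15 = 3 * 5
accepts_13 = deterministic_accepts(toy_machine, 13, 4)   # 13 is prime
```

The loop visits all $2^{P(|x|)}$ witnesses in the worst case, which is the source of the exponential time bound.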


The class $\bigcup_{c>0} \mathrm{DTIME}(2^{n^c})$ is also known as “exponential time”. Problems
requiring exponential time are considered intractable since the cost of solving them
by computer quickly becomes prohibitive. If we could somehow avoid having to test
every string y of length P(|x|) in our simulation of machine $M_R$, this simulation could
obviously be sped up. A natural question at this point is whether the bit sequences
output by a CSB generator are sufficiently “random” to be useful in this context. Yao
has answered this question affirmatively by proving that such a generator can in fact
be used to deterministically simulate a probabilistic machine and that the simulation
requires only “sub-exponential” time.


Theorem 3 (Yao [Y]) Suppose that for any polynomial P, it is possible to construct
a CSB generator which takes k-bit inputs and produces P(k)-bit outputs. Then
$R \subseteq \bigcap_{\epsilon>0} \mathrm{DTIME}(2^{n^\epsilon})$.

Proof Let $L \in R$ be accepted by a probabilistic machine $M_R$. Suppose that $M_R$ runs
in time $n^j$ where n is the input length. We construct a deterministic machine M which
accepts L in sub-exponential time. In the previous, exponential-time simulation, the
machine M would have tried every possible witness y of length $n^j$ in simulating the
probabilistic machine. There are $2^{n^j}$ such strings which is what forces the exponential
time bound. Here, M will only try those witnesses of length $n^j$ which are output by
the CSB generator. Suppose, for example, that the generator takes k-bit inputs and
produces $k^c$-bit outputs. Thus, an input to the generator (a “seed”) of length $n^{j/c}$ will
produce an output of length $n^j$. The deterministic machine M inputs each string of
length $n^{j/c}$ to the generator producing a possible witness y of length $n^j$. M simulates
$M_R(x, y)$ for each such y and accepts if and only if $M_R$ ever accepts. Since there are at
most $2^{n^{j/c}}$ seeds of length $n^{j/c}$, the entire simulation can be carried out in time $2^{d n^{j/c}}$
for some fixed d. By assumption, the generator can be constructed to “stretch” its
inputs by any polynomial amount. Here, this means that the value of c can be chosen
to be any fixed value. Thus, for any $\epsilon$, the simulation can be carried out in time $2^{n^\epsilon}$.
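
The seed-enumeration simulation can be sketched as follows. Everything in this example is a toy assumption: `toy_generator` simply repeats its seed and is in no sense a CSB generator, and `toy_machine` is a hypothetical one-sided machine that accepts when the witness encodes a nontrivial divisor of x. The point is only to show the control flow of M.

```python
from itertools import product

def simulate_with_generator(machine, generator, x, seed_len):
    """Machine M: try only the witnesses produced by the generator from each seed."""
    return any(machine(x, generator(seed))
               for seed in product((0, 1), repeat=seed_len))

def toy_generator(seed):
    """Hypothetical stretch from k bits to 2k bits; NOT pseudo-random."""
    return seed + seed

def toy_machine(x, y):
    """Hypothetical M_R: read y as a binary number d; accept iff d divides x nontrivially."""
    d = int("".join(map(str, y)), 2)
    return 1 < d < x and x % d == 0

# Seeds of length 2 yield the witnesses 0000, 0101, 1010, 1111 (d = 0, 5, 10, 15).
found_15 = simulate_with_generator(toy_machine, toy_generator, 15, 2)
found_13 = simulate_with_generator(toy_machine, toy_generator, 13, 2)
found_9 = simulate_with_generator(toy_machine, toy_generator, 9, 2)
```

Only $2^2$ witnesses are tried instead of $2^4$. The simulation never accepts a string outside L, but with this weak generator it wrongly rejects 9; the remainder of the proof argues that an error of this kind would yield a statistical test which the generator's output fails.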


Note that in our original definition of CSB generators (Lemma 1.2), we assumed
only that each seed could be found in probabilistic polynomial time. Here, where we
want to use the CSB generator for a deterministic algorithm, we simply try every string
of length $n^{j/c}$ as an input to the generator. Some of these might not be actual seeds
for the generator, hence the resulting output might not be a pseudo-random sequence.
However, this method will cover every possible seed which is all that is necessary here.


It remains to show that the machine M accepts the same language L which is
accepted by $M_R$. As usual, let $I_k$ be the set of inputs to the generator of length k
and let $S_k = \{s_x \mid x \in I_k\}$. We will show that if M does not accept L, then the set
$S = \bigcup_k S_k$ output by the generator fails a polynomial-size statistical test. The relative
ease of this proof should make clear the reason for introducing statistical tests as
criteria for our generators.


Recall that if $x \in L$ then for a randomly chosen string y, $M_R(x, y)$ accepts with
probability at least $\frac{1}{2}$. If $x \notin L$, then for no string y does $M_R(x, y)$ accept. So if M does
not accept the language L, there must be some $x \in L$ which M incorrectly rejects
(i.e., none of the possible witnesses w output by the CSB generator cause $M_R(x, w)$ to
accept even though $x \in L$). Furthermore, there must exist infinitely many such strings
$x \in L$ or else M could be repaired without increasing its running time.

Let x be any element of L which M fails to accept. Recall that the probabilistic
machine $M_R$ runs in time $n^j$ where n is the input length. So, on input x, the deterministic
machine M will simulate $M_R(x, y)$ on all strings y of length $|x|^j$ output by the CSB
generator. Thus M will have to try all seeds of length $|x|^{j/c}$ since the generator
“stretches” k-bit inputs into $k^c$-bit outputs. For the remainder of the proof, we will
write k in place of $|x|^{j/c}$.

For each of the infinitely many such strings x, there will be a circuit $C_k$ with
$k^c = (|x|^{j/c})^c = |x|^j$ inputs which has x “hard-wired” in. On any input y, the circuit
$C_k$ simulates $M_R(x, y)$. This circuit outputs 1 if and only if $M_R(x, y)$ accepts.

Since $x \in L$ we have, by the definition of R, for a randomly chosen string y of
length $|x|^j$:

$$p_k^R = \mathrm{Prob}\{C_k(y) = 1\} \ge \frac{1}{2}.$$

However, since $M_R(x, y)$ does not accept for any $y \in S_k$, for a randomly chosen $y \in S_k$
we have:

$$p_k^S = \mathrm{Prob}\{C_k(y) = 1\} = 0.$$

Thus $p_k^R - p_k^S \ge \frac{1}{2}$ so there is certainly a polynomial R such that $p_k^R - p_k^S > \frac{1}{R(k)}$.
Therefore, the set S fails the statistical test $\{C_k\}$ contradicting our assumption that
S is the output of a CSB generator. This concludes the proof.


Recall now Theorem 1, which said that given any weak one-way function, it is
possible to construct a CSB generator which “stretches” its seeds by any polynomial
amount. Combined with Theorem 3, we get:


Corollary 3 If there exists a weak one-way function, then $R \subseteq \bigcap_{\epsilon>0} \mathrm{DTIME}(2^{n^\epsilon})$.
Chapter 3


Refinements of Yao’s Theorems


In this chapter we refine somewhat Yao’s results presented in Chapter 2. More
specifically, we assume the existence of two new kinds of one-way function, both in turn
different from the weak one-way function described in Chapter 2. We then explore the
types of pseudo-random bit generators which can be constructed from these functions.


3.1 A Low-Level Refinement 


Recall that in Chapter 2 we defined a weak one-way function as being polynomial- 
time computable but difficult to invert by any family of polynomial-size circuits. Here, 
we consider an even more restricted definition. Informally, an R-weak one-way function 
is polynomial-time computable but difficult to invert by any family of R(k)-size circuits 
where R(k) is some polynomial. A rigorous definition of these functions follows the 
definition of “O notation”: 


Definition Let f(n) and g(n) be functions. We say that:
f(n) = O(g(n))


if there exists some constant c such that for all sufficiently large values of n:
$f(n) \le c\, g(n)$.


Definition Let $I_k$ be the set of all k-bit strings, let $D_k \subseteq I_k$ and let $f_k : D_k \to D_k$ be a
sequence of permutations. We will write $D = \bigcup_k D_k$ and $f = \{f_k\}$. Let R(k) be some
polynomial. Then, f is an R-weak one-way function (or simply an R-one-way function)
if the following properties are satisfied:

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input k, selects an $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs k and $x \in D_k$, computes
$f_k(x)$.

(3) There exists a polynomial Q such that the following holds. Let $C = \{C_k\}$ be
any family of circuits where each $C_k$ has k inputs and size O(R(k)). Then for all
sufficiently large k:

$C_k(x) \ne f_k^{-1}(x)$ for at least a fraction $\frac{1}{Q(k)}$ of the $x \in D_k$.


Just as we did in Chapter 2, it is possible to construct a pseudo-random bit 
generator from an R-one-way function. However, as might be expected, the output 
from such a generator cannot be expected to pass all polynomial-size next bit tests. 
Instead, we need a weaker notion of such a test, called an RQ-next-bit test. Intuitively,
a set of strings passes an RQ-next-bit test if no family of size $R(k)$ circuits can,
infinitely often, predict the next bit with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$.


Definition Let $P$, $Q$ and $R$ be fixed polynomials, $S_k$ a multiset consisting of $P(k)$-long
bit sequences and $S = \bigcup_k S_k$. An RQ-next-bit test for $S$ is a family of circuits
$C = \{C_k^i\}$. Each circuit $C_k^i$ has $i$ Boolean inputs where $i < P(k)$, one Boolean output
and size $O(R(k))$. On input the first $i$ bits of a sequence $s$ randomly selected from $S_k$,
$C_k^i$ will output a bit $b$. Let $p_{k,i}^C$ denote the probability that $b$ = the $i+1$st bit of $s$. We
say that $S$ passes the test $C$ if for all sufficiently large $k$ and all $i < P(k)$:

$$p_{k,i}^C < \frac{1}{2} + \frac{1}{Q(k)}.$$

We will refer to both $C$ and $C_k^i$ as RQ-next-bit tests.


Note the differences between this definition and the definition of general next-bit tests 
given in Chapter 2. Here, we restrict both the size of the circuit doing the predicting, 
as well as the degree of accuracy to which it can predict. In general, we will expect that 
the polynomial R, the size of the next-bit circuit, will depend on the polynomial Q, the
accuracy threshold. The reason for this will be made clear below. Finally, we define 
the analogous notion of the CSB generator, an RQ-weak pseudo-random bit generator: 


Definition Let $P$, $Q$ and $R$ be fixed polynomials, $I_k$ the set of all strings of length $k$,
and $D_k \subseteq I_k$ a set of inputs (or “seeds”) of length $k$. Let $G$ be a deterministic algorithm
which, on input a seed $x \in D_k$, outputs a $P(k)$-long bit sequence $s_x$ in Poly($k$) time. Let
$S_k = \{s_x \mid x \in D_k\}$. The algorithm $G$ is a Cryptographically RQ-weak pseudo-random
bit generator (or simply an RQ-weak generator) if the multiset $S = \bigcup_k S_k$ passes every
RQ-next-bit test.

Recall that in Chapter 2 we showed, given a function which was difficult to invert 
with any polynomial-size circuit, how to construct a bit generator whose output passed 
all polynomial-size next-bit tests. Here, given a function which is difficult to invert 
with some fixed polynomial-size circuit, it would be nice to construct a generator 
whose output passed all next-bit tests of some, possibly different, fixed polynomial size. 
Unfortunately, using the techniques of Chapter 2, this does not seem possible. The
logic behind the proof in Chapter 2 was that if there existed a polynomial-size next-bit
test which succeeded with probability $\frac{1}{2} + \frac{1}{Q(k)}$ for some polynomial $Q$, then it would be
possible to construct a polynomial-size circuit to invert the assumed one-way function
on most inputs. Unfortunately, the size of this circuit depended on the polynomial
$Q(k)$. Thus if we assume the existence of a U-one-way function, it is only possible,
using these methods, to construct a bit generator whose output passes all next-bit
tests of some fixed size which succeed with probability less than $\frac{1}{2} + \frac{1}{Q(k)}$ for some fixed
polynomial $Q$. We now state the main theorem of this section:


Theorem 4 Let $I_k$ be the set of all strings of length $k$ and let $f = \{f_k\}$ be a U-one-way
function over some accessible domain $E_k \subseteq I_k$. Thus, there exists a constant $d$ such
that, given any family of circuits $C = \{C_k\}$ where each $C_k$ has $k$ Boolean inputs and
size $O(U(k))$ we have, for all sufficiently large $k$:

$$C_k(x) \ne f_k^{-1}(x) \text{ for at least a fraction } \frac{1}{k^d} \text{ of the } x \in E_k.$$

By definition, $f_k(x)$ can be computed in polynomial time for all $x \in E_k$; thus by a
standard result from complexity theory (see [FP]) we may assume that there exists
some polynomial $F(k)$ such that for each value of $k$, $f_k(x)$ can be computed by a circuit
of size $F(k)$ for all $x \in E_k$. Then, for any polynomial $P$, it is possible to construct an
RQ-weak generator which “stretches” seeds of length $k$ into outputs of length $P(k)$,
where:

$$R(k) = O\left(\frac{U(k)}{k^2 Q(k)} - F(k)k^{2d} - P(k)k^{2d}\right).$$


Proof We first show a set of conditions, very similar to those given in Lemma 1.1,
which are sufficient for constructing weak generators. We then show how to satisfy
these conditions, given any U-one-way function.

Lemma 4.1 Let $I_k$ be the set of $k$-bit strings and let $D_k \subseteq I_k$. Let $g_k : D_k \to D_k$ be
a sequence of permutations and let $B_k : D_k \to \{0,1\}$ be a sequence of predicates. We
will write $D = \bigcup_k D_k$, $g = \{g_k\}$ and $B = \{B_k\}$. Let $P(k)$ be any polynomial. If the
following set of conditions hold, then it is possible to construct an RQ-weak generator
which “stretches” $k$-bit seeds into $P(k)$-bit outputs. The polynomial $R$ is equal to:

$$R(k) = T(k) - P(k)\,\mathrm{size}(B_k(g_k))$$

where $T(k)$ and size($B_k(g_k)$) are defined below.

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input $k$, chooses $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$g_k(x)$.

(3) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$B_k(g_k(x))$. Thus, $B_k(g_k(x))$ can be computed by a polynomial-size circuit. Let this
circuit have size size($B_k(g_k)$).

(4) Let $C = \{C_k\}$ be any family of circuits such that each $C_k$ has $k$ inputs and size
$O(T(k))$ for some fixed polynomial $T$, and let $Q$ be some fixed polynomial. Then,
for all sufficiently large $k$:

$$C_k(x) \ne B_k(x) \text{ for at least a fraction } \frac{1}{2} - \frac{1}{Q(k)} \text{ of the } x \in D_k.$$

Note that the polynomial $T$ may well be a function of the polynomial $Q$; thus, as
expected, $R$ will probably depend on $Q$.


Proof of Lemma 4.1 First we construct the weak generator and then prove that its
outputs must pass every RQ-next-bit test. The proof is very similar to that of Lemma
1.1.

Choose an appropriate value of $k$ to be the seed length and choose in probabilistic
polynomial time a random $x \in D_k$ to be used as the seed. Set $c = P(k)$, the desired
length of the output sequence, and generate the bits:

$$B_k(g_k(x)),\ B_k(g_k^2(x)),\ \ldots,\ B_k(g_k^c(x)).$$

The notation $g_k^j$ indicates the $j$-fold composition of $g_k$. Now, output these bits in
reverse order, i.e.:

$$B_k(g_k^c(x)),\ B_k(g_k^{c-1}(x)),\ \ldots,\ B_k(g_k(x)).$$

It should be clear that all of this can be accomplished in polynomial time by conditions
(2) and (3).
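
The generator construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the permutation `toy_g` and the predicate `toy_B` are hypothetical stand-ins chosen so that the code runs, and they are in no sense one-way or unpredictable. Only the iterate-then-reverse structure mirrors the text.

```python
from typing import Callable, List


def generate_bits(seed: int,
                  g: Callable[[int], int],
                  B: Callable[[int], int],
                  c: int) -> List[int]:
    """Return B(g^c(seed)), B(g^(c-1)(seed)), ..., B(g(seed))."""
    bits = []
    x = seed
    for _ in range(c):
        x = g(x)            # after j iterations, x equals g^j(seed)
        bits.append(B(x))   # record B(g^j(seed))
    bits.reverse()          # output in reverse order, as in the construction
    return bits


# Hypothetical toy instances (assumptions for illustration only).
def toy_g(x: int) -> int:
    # An affine map with an odd multiplier is a permutation of {0, ..., 15}.
    return (5 * x + 3) % 16


def toy_B(x: int) -> int:
    # Lowest-order bit; trivially predictable, used only to make the sketch run.
    return x & 1


if __name__ == "__main__":
    print(generate_bits(1, toy_g, toy_B, 4))
```

Because the bits are emitted in reverse, a shorter output from the same seed is a suffix of a longer one. The proof exploits the same structure from the other side: a prefix of the output is made of bits from later iterates, which anyone can compute forwards from an intermediate point.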

It remains to show that the sequences output by this generator pass every
RQ-next-bit test where:

$$R(k) = T(k) - P(k)\,\mathrm{size}(B_k(g_k)),$$

where $P(k)$ is the amount that the generator “stretches” $k$-bit seeds, size($B_k(g_k)$) is the
minimum size of a circuit computing $B_k(g_k(x))$, and the predicate $B_k$ cannot be computed
with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$ by any circuit of size $O(T(k))$.

Suppose that this is not true. Then there exists a family of circuits $C = \{C_k^i\}$ where
each circuit $C_k^i$ has $i < P(k)$ inputs, size $O(R(k))$ and the following holds. For each of
infinitely many values of $k$ there exists some $i$ such that:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)},$$

where $p_{k,i}^C$ is the probability that the circuit $C_k^i$ outputs the correct $i+1$st bit of a
sequence when given the first $i$ bits as input.

We now construct a family of circuits $A = \{A_k\}$ where each $A_k$ has $k$ inputs,
size $O(T(k))$ and such that for infinitely many values of $k$, $A_k$ correctly computes the
predicate $B_k(x)$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$. This contradicts condition (4).

Choose one of the infinitely many values of $k$ such that for some $i < P(k)$:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)}.$$

The circuit $A_k$ uses $C_k^i$ as a “subroutine”. On input $x \in D_k$, the circuit $A_k$ first
generates the $i$-bit sequence:

$$B_k(g_k^i(x)),\ B_k(g_k^{i-1}(x)),\ \ldots,\ B_k(g_k(x))$$

and inputs this sequence to the next-bit test circuit $C_k^i$. This can be accomplished
with $O(i\,\mathrm{size}(B_k(g_k)))$ circuit gates by condition (3). The circuit $A_k$ then outputs
whatever value $C_k^i$ outputs on these bits, making the total size of $A_k$ equal to:

$$\begin{aligned}
\mathrm{size}(A_k) &= O(i\,\mathrm{size}(B_k(g_k)) + R(k)) \\
&= O(i\,\mathrm{size}(B_k(g_k)) + T(k) - P(k)\,\mathrm{size}(B_k(g_k))) \\
&= O(T(k)) \quad (\text{since } i < P(k)).
\end{aligned}$$

By showing that the circuit $A_k$ computes the predicate $B_k$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$
we get a contradiction to condition (4).

Note that the bits:

$$B_k(g_k^i(x)),\ B_k(g_k^{i-1}(x)),\ \ldots,\ B_k(g_k(x))$$

are the first $i$ bits of the CSB sequence:

$$B_k(g_k^i(x)),\ \ldots,\ B_k(g_k(x)),\ B_k(x),\ \ldots,\ B_k(g_k^{i-c+1}(x)).$$

Since the $i+1$st bit of this sequence is $B_k(x)$, the circuit $A_k$ will correctly compute
$B_k(x)$ whenever the circuit $C_k^i$ correctly outputs the $i+1$st bit. Furthermore, the seed
of this sequence is $g_k^{i-c}(x)$ and since $g_k$ is, by assumption, a permutation, so is $g_k^{i-c}$.
Thus, $g_k^{i-c}$ generates all possible seeds $x \in D_k$, which means that the next-bit circuit
$C_k^i$ will correctly output the $i+1$st bit of this particular sequence for a fraction at
least $\frac{1}{2} + \frac{1}{Q(k)}$ of the $x \in D_k$. Thus, the circuit $A_k$ computes $B_k(x)$ for a fraction at
least $\frac{1}{2} + \frac{1}{Q(k)}$ of the $x \in D_k$ and has size $O(T(k))$, contradicting condition (4). This
completes the proof of the lemma.


We must now show that given any U-weak one-way function over an accessible domain
$E = \bigcup_k E_k$, it is possible to construct a new domain $D = \bigcup_k D_k$, a function $g = \{g_k\}$
and a predicate $B = \{B_k\}$ which satisfy the four conditions of Lemma 4.1.

Lemma 4.2 Let $f = \{f_k\}$ be a U-weak one-way function over an accessible domain
$E = \bigcup_k E_k$. Then, by definition, there exists some constant $d$ such that given any
family of circuits $C = \{C_k\}$ where $C_k$ has $k$ inputs and size $O(U(k))$, we have, for all
sufficiently large $k$:

$$C_k(x) \ne f_k^{-1}(x) \text{ for at least a fraction } \frac{1}{k^d} \text{ of the } x \in E_k.$$

Furthermore, assume that $f_k(x)$ can be computed by a circuit of size $F(k)$. The
following construction satisfies conditions (1)-(4) of Lemma 4.1:

(a) Set $D_k$ to be the cartesian product of $k^{2d}$ copies of $E_k$. It will be less cumbersome
to write $ck$ instead of $k^{2d}$ so we will generally choose to do this, as in Chapter 2.
Thus, $D_k$ is the cartesian product of $ck$ copies of $E_k$ where $c = k^{2d-1}$. Formally:

$$D_k = \{ (x_1, x_2, \ldots, x_{ck}) \mid x_1 \in E_k, \ldots, x_{ck} \in E_k \}.$$

(b) Let $g_k((x_1, x_2, \ldots, x_{ck})) = (f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck}))$ where each $x_j \in E_k$.

(c1) Let $B_k^i(x)$ = the $i$th bit of $f_k^{-1}(x)$ where $x \in E_k$.

(c2) Let

$$B_k((x_1, x_2, \ldots, x_{ck})) = \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(x_{c(i-1)+j})$$

where each $x_j \in E_k$ and “$\oplus$” denotes the “exclusive-or” operation.


Proof of Lemma 4.2 It should be fairly obvious that conditions (1), (2) and (3) of
Lemma 4.1 are satisfied by this construction. The domain $D_k$ is certainly accessible
since, by assumption, $E_k$ is accessible. Given any $(x_1, x_2, \ldots, x_{ck}) \in D_k$, the function
$g_k((x_1, x_2, \ldots, x_{ck}))$ can be computed in polynomial time since, by assumption, each
$f_k(x_j)$ is computable in polynomial time. Finally, the predicate:

$$\begin{aligned}
B_k(g_k((x_1, x_2, \ldots, x_{ck}))) &= B_k((f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck}))) \\
&= \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(f_k(x_{c(i-1)+j}))
\end{aligned}$$

can easily be computed in polynomial time since for any $x \in E_k$:

$$\begin{aligned}
B_k^i(f_k(x)) &= \text{the $i$th bit of } f_k^{-1}(f_k(x)) \\
&= \text{the $i$th bit of } x.
\end{aligned}$$

In fact, for any $(x_1, x_2, \ldots, x_{ck}) \in D_k$, the predicate $B_k(g_k((x_1, x_2, \ldots, x_{ck})))$ can be
computed by a circuit of size $O(ck) = O(k^{2d})$ since it simply consists of $k^{2d}$ exclusive-ors.
Thus, the value here of size($B_k(g_k)$) in condition (3) above is:

$$\mathrm{size}(B_k(g_k)) = O(k^{2d}).$$
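
As a small illustration of why $B_k(g_k(\cdot))$ is cheap to evaluate, the following Python sketch computes it directly from the input tuple as an exclusive-or of $ck$ bits, never inverting $f_k$. The function name `predicate_of_image` and the convention that "the $i$th bit" counts from the most significant bit of a $k$-bit integer are assumptions made here for the example; the text does not fix a bit order.

```python
from typing import List


def predicate_of_image(xs: List[int], k: int, c: int) -> int:
    """Compute B_k(g_k((x_1, ..., x_ck))) as the XOR, over groups i = 1..k and
    members j = 1..c, of the i-th bit of x_{c(i-1)+j}."""
    assert len(xs) == c * k
    result = 0
    for i in range(1, k + 1):
        for j in range(1, c + 1):
            x = xs[c * (i - 1) + j - 1]       # x_{c(i-1)+j}; the text is 1-indexed
            result ^= (x >> (k - i)) & 1      # i-th bit, most significant first (assumed)
    return result


if __name__ == "__main__":
    # k = 2 bits per element, c = 2 elements per group, so ck = 4 elements.
    print(predicate_of_image([0b10, 0b01, 0b11, 0b00], k=2, c=2))
```

The loop performs exactly $ck$ exclusive-ors, which matches the $O(ck) = O(k^{2d})$ size bound stated above.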


The more difficult task is to prove that, as constructed, the predicate $B_k$ satisfies
condition (4) of Lemma 4.1. In particular, we must show that for an appropriately
chosen polynomial $T(k)$ the following holds. Let $C = \{C_k\}$ be any family of circuits
such that each circuit $C_k$ takes inputs from the domain $D_k$ constructed in (a) above
and has size $O(T(k))$. Let $T(k)$ be set to:

$$T(k) = \frac{U(k)}{k^2 Q(k)} - F(k)k^{2d},$$

where $Q(k)$ is a polynomial. We will show that for every such family of circuits and all
sufficiently large $k$:

$$C_k((x_1, x_2, \ldots, x_{ck})) \ne B_k((x_1, x_2, \ldots, x_{ck})) \text{ for at least a fraction } \frac{1}{2} - \frac{1}{Q(k)} \text{ of the } (x_1, x_2, \ldots, x_{ck}) \in D_k.$$

Actually, we will choose to formulate this problem slightly differently. For every such
family of circuits $\{C_k\}$ we will show, for all sufficiently large $k$:

$$\mathrm{Prob}[\,C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))\,] < \frac{1}{2} + \frac{1}{Q(k)},$$

where $(x_1, x_2, \ldots, x_{ck})$ is a randomly chosen element of $D_k$. This will be proven through
a sequence of two lemmas. First, we repeat the definition of the probability $s_i^C$, defined
in Chapter 2. For details on what a “group” is, see page 15.


Definition Let $C = \{C_k\}$ be a family of circuits where each $C_k$ takes inputs from the
domain $D_k$. Think of the circuit $C_k$ as trying to compute the predicate $B_k$. Now, for
each $x \in E_k$ define the probability $s_i^C(x)$ as follows:

$$s_i^C(x) = \mathrm{Prob}[\,C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))\,]$$

where $(x_1, x_2, \ldots, x_{ck})$ is randomly drawn from the elements of $D_k$
which have $x$ in the $i$th group.


Lemma 4.2.1 Let $C = \{C_k\}$ be a family of circuits where each circuit $C_k$ takes inputs
from the domain $D_k$. Suppose that there exists a polynomial $V(k)$ and a constant $d$
such that for all sufficiently large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[\,s_i^C(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of $i$ from 1 to $k$}\,\right] \ge 1 - \frac{1}{k^d}.$$

If each circuit $C_k$ has size:

$$O\left(\frac{U(k)}{k^2 V(k)} - F(k)k^{2d}\right)$$

then there exists a family of circuits $A = \{A_k\}$ where each $A_k$ has $k$ Boolean inputs,
size $O(U(k))$ and for all sufficiently large $k$:

$$A_k(x) = f_k^{-1}(x) \text{ for at least a fraction } 1 - \frac{1}{k^d} \text{ of the } x \in E_k.$$


Proof The main details of the proof are identical to the proof of Lemma 1.2.2
in Chapter 2. The only remaining step necessary here is to show that, under the
assumptions of this lemma, Algorithm A2 on page 20 can be converted into a circuit
$A_k$ of size $O(U(k))$. We reproduce this algorithm below.

Algorithm A2

input: $x \in E_k$
output: $y \in E_k$ such that $y = f_k^{-1}(x)$ with probability $1 - \frac{1}{k^d}$

(1) Set y = null
(2) For h = 1 to k do
    (2.1) Set count = 0
    (2.2) Set t = 6kV(k)
    (2.3) For sample = 1 to t do
        (2.3.1) Choose next “built-in” input for sampling:
                $x_1, x_2, \ldots, x_{c(h-1)}, x_{c(h-1)+2}, \ldots, x_{ck}$
        (2.3.2) Set $u = C_k((f_k(x_1), \ldots, f_k(x_{c(h-1)}), x, f_k(x_{c(h-1)+2}), \ldots, f_k(x_{ck})))$
        (2.3.3) Set $v = \bigoplus_{i=1}^{h-1} \bigoplus_{j=1}^{c}$ ($i$th bit of $x_{c(i-1)+j}$)
                $\oplus \bigoplus_{j=2}^{c}$ ($h$th bit of $x_{c(h-1)+j}$)
                $\oplus \bigoplus_{i=h+1}^{k} \bigoplus_{j=1}^{c}$ ($i$th bit of $x_{c(i-1)+j}$)
        (2.3.4) Set count = count + ($u \oplus v$)
    (2.4) Set avg = count / t


    (2.5) If avg > $\frac{1}{2} + \frac{1}{2V(k)}$ then set w = 1
    (2.6) Else if avg < $\frac{1}{2} - \frac{1}{2V(k)}$ then set w = 0
    (2.7) Set $y = w \circ y$ ($\circ$ denotes concatenation)
(3) Output y
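
To make the sampling idea of Algorithm A2 concrete, here is a toy Python sketch. Everything in it is an assumption made for illustration: the 3-bit permutation `f`, the group size, the bit-order convention, and the use of a perfect predictor (the predicate `B` itself) in place of a circuit $C_k$ that is only correct with probability slightly above one half. It also replaces the two-sided threshold of steps (2.5) and (2.6) with a plain majority vote and appends recovered bits most significant first. It is not the algorithm of the text, only the same "plant $x$ in group $h$ and cancel the known bits" mechanism.

```python
import random

K = 3          # bits per element (toy size, assumed)
C_GROUP = 2    # c: elements per group, so each tuple has c*k entries


def f(x):
    # Toy permutation of {0, ..., 7}; NOT one-way.
    return (5 * x + 3) % 8


def f_inv(y):
    # Brute-force inverse, feasible only because the toy domain is tiny.
    return next(x for x in range(8) if f(x) == y)


def bit(x, i):
    # i-th bit of x for i = 1..K, most significant first (a convention chosen here).
    return (x >> (K - i)) & 1


def B(ys):
    # Predicate of (c2): XOR over groups i and members j of the i-th bit of f^{-1}(y).
    out = 0
    for i in range(1, K + 1):
        for j in range(1, C_GROUP + 1):
            out ^= bit(f_inv(ys[C_GROUP * (i - 1) + j - 1]), i)
    return out


def invert(x, predictor, t, rng):
    """Recover f^{-1}(x) bit by bit, voting over t samples for each bit."""
    y = 0
    for h in range(1, K + 1):
        count = 0
        pos = C_GROUP * (h - 1)                 # 0-based slot of x_{c(h-1)+1}
        for _ in range(t):
            xs = [rng.randrange(8) for _ in range(C_GROUP * K)]
            ys = [f(v) for v in xs]
            ys[pos] = x                         # plant x in group h
            u = predictor(ys)
            v = 0                               # XOR of all contributions we already know
            for i in range(1, K + 1):
                for j in range(1, C_GROUP + 1):
                    idx = C_GROUP * (i - 1) + j - 1
                    if idx != pos:
                        v ^= bit(xs[idx], i)
            count += u ^ v                      # equals the h-th bit of f^{-1}(x) when u is right
        w = 1 if 2 * count > t else 0           # simple majority (simplification)
        y = (y << 1) | w
    return y


if __name__ == "__main__":
    rng = random.Random(0)
    print([invert(x, B, 5, rng) == f_inv(x) for x in range(8)])
```

With a perfect predictor every sample votes correctly, so the output is exact. With a noisy predictor the number of samples `t` and the decision thresholds are what control the error, which is the role of steps (2.2), (2.5) and (2.6).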
Let us analyze the “circuit size” of this algorithm. There are two loops, one of $k$
iterations (step (2)) and one of $t$ iterations (step (2.3)). Inside the inner loop, steps
(2.3.1) and (2.3.3) each require $O(ck)$ circuit gates and step (2.3.2) requires size:

$$O(ckF(k) + \mathrm{size}(C_k))$$

where $F(k)$ is the assumed circuit size for computing $f_k$ and size($C_k$) is the size of
the circuit $C_k$ which is computing the predicate $B_k$. Thus the entire algorithm can be
converted to a circuit $A_k$ which computes $f_k^{-1}$ with probability at least $1 - \frac{1}{k^d}$ and has
size:

$$O(kt[ckF(k) + \mathrm{size}(C_k)]).$$

The constant $c$ has been defined to be $k^{2d-1}$ and from step (2.2) we see that $t = O(kV(k))$.
Also, by assumption, we know that:

$$\mathrm{size}(C_k) = O\left(\frac{U(k)}{k^2 V(k)} - F(k)k^{2d}\right).$$

Thus, the circuit $A_k$ has size:

$$O\left(k^2 V(k)\left[k^{2d}F(k) + \frac{U(k)}{k^2 V(k)} - F(k)k^{2d}\right]\right) = O(U(k)).$$
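
The cancellation in this size bound can be checked numerically. The sketch below assumes the forms $t = 6kV(k)$, $c = k^{2d-1}$ and size($C_k$) $= U(k)/(k^2 V(k)) - F(k)k^{2d}$ with all hidden constants set to 1. The sample polynomials `U`, `V` and `F` are hypothetical choices made only so that the arithmetic can be run.

```python
from fractions import Fraction


def size_of_A(k, d, U, V, F):
    """Evaluate k * t * (c*k*F(k) + size(C_k)) exactly, with constants set to 1."""
    c = k ** (2 * d - 1)
    t = 6 * k * V(k)
    size_C = Fraction(U(k), k * k * V(k)) - F(k) * k ** (2 * d)
    return k * t * (c * k * F(k) + size_C)


# Hypothetical sample polynomials (assumptions for illustration only).
def U(k): return k ** 10
def V(k): return k
def F(k): return k


if __name__ == "__main__":
    for k in range(2, 6):
        print(k, size_of_A(k, 1, U, V, F), 6 * U(k))
```

The $F(k)k^{2d}$ term subtracted from size($C_k$) exactly offsets the cost $ckF(k)$ of evaluating $f_k$ on the built-in samples, so the total comes out to a constant multiple of $U(k)$.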


This completes the proof of the lemma. Since $f_k$ is assumed to be a U-one-way function,
we have actually shown:

Corollary 4.2.1 Let $C = \{C_k\}$ be any family of circuits where each $C_k$ takes inputs
from the domain $D_k$ and has size:

$$O\left(\frac{U(k)}{k^2 V(k)} - F(k)k^{2d}\right).$$

Then for all sufficiently large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[\,s_i^C(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of $i$ from 1 to $k$}\,\right] < 1 - \frac{1}{k^d}.$$


We need one more lemma to finish the proof of Theorem 4. 


Lemma 4.2.2 Let $C = \{C_k\}$ be any family of circuits where each $C_k$ takes inputs
from the domain $D_k$ and has size:

$$O\left(\frac{U(k)}{k^2 Q(k)} - F(k)k^{2d}\right).$$

Then, for a randomly chosen $(x_1, x_2, \ldots, x_{ck}) \in D_k$ we have, for all sufficiently large $k$:

$$\mathrm{Prob}[\,C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))\,] < \frac{1}{2} + \frac{1}{Q(k)}.$$


Proof Suppose first that some family of circuits $C = \{C_k\}$ has the property that each
$C_k$ takes inputs from the domain $D_k$ and has size:

$$O\left(\frac{U(k)}{k^2 V(k)} - F(k)k^{2d}\right) \qquad (1)$$

for some fixed polynomial $V(k)$. Then, from Corollary 4.2.1, for all sufficiently large $k$
we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[\,s_i^C(x) < \frac{1}{2} + \frac{1}{V(k)} \text{ for some value of $i$ from 1 to $k$}\,\right] > \frac{1}{k^d}.$$

From Lemma 1.2.3 of Chapter 2, it then follows that there exists a constant $a$ and
polynomial $Q$ such that $V(k) = aQ(k)$ and:

$$\mathrm{Prob}[\,C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))\,] < \frac{1}{2} + \frac{1}{Q(k)}.$$

Replacing $V(k)$ by $aQ(k)$ in line (1) above gives:

$$O\left(\frac{U(k)}{k^2 a Q(k)} - F(k)k^{2d}\right)$$

which is just:

$$O\left(\frac{U(k)}{k^2 Q(k)} - F(k)k^{2d}\right)$$

proving the lemma.


Thus, the construction of Lemma 4.2 satisfies all four conditions of Lemma 4.1.
For this construction we have that the quantity size($B_k(g_k)$) of condition (3) is, as
noted above, equal to $O(k^{2d})$. As just proven in Lemma 4.2.2, the polynomial $T(k)$ in
condition (4) is equal to:

$$O\left(\frac{U(k)}{k^2 Q(k)} - F(k)k^{2d}\right).$$

Thus, the construction of Lemma 4.2 produces an RQ-weak generator where:

$$R(k) = O\left(\frac{U(k)}{k^2 Q(k)} - k^{2d}F(k) - P(k)k^{2d}\right).$$

This completes the proof of Theorem 4.


3.2 Fast Simulation of Probabilistic Computation


We now explore a very much stronger notion of a one-way function. Namely, we consider 
functions which can be computed in polynomial time but whose inverse is difficult to 
compute with high probability by any circuits of some fixed sub-exponential size. We
show that if any such function exists, it is possible to build a bit generator which
can be used to simulate polynomial-time probabilistic computation in deterministic
polynomial time.

Definition Let $I_k$ be the set of all $k$-bit strings, let $D_k \subseteq I_k$ and let $f_k : D_k \to D_k$
be a sequence of permutations. We will write $D = \bigcup_k D_k$ and $f = \{f_k\}$. Then, $f$
is a $2^{k^a}$-weak one-way function (or simply a $2^{k^a}$-one-way function) if the following
properties are satisfied:

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input $k$, selects an $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$f_k(x)$.

(3) There exists a polynomial $Q$ and a fixed constant $a < 1$ such that the following
holds. If $C = \{C_k\}$ is a family of circuits where each $C_k$ has $k$ inputs and size
Poly($2^{k^a}$) then for all sufficiently large $k$:

$$C_k(x) \ne f_k^{-1}(x) \text{ for at least a fraction } \frac{1}{Q(k)} \text{ of the } x \in D_k.$$

We now define an analogous next-bit test which corresponds to this new function.
As might be expected, these next-bit tests will have sub-exponential size. Similarly,
the analogous bit generator will pass all such next-bit tests.


Definition Let $a < 1$ be a fixed constant, $P$ be a polynomial, $S_k$ a multiset
consisting of $2^{k^a}$-long bit sequences and $S = \bigcup_k S_k$. A $2^{k^a}$-next-bit test for $S$ is a
family of circuits $C = \{C_k^i\}$. Each circuit $C_k^i$ has $i$ Boolean inputs where $i < 2^{k^a}$,
one Boolean output and size Poly($2^{k^a}$). On input the first $i$ bits of a sequence $s$
randomly selected from $S_k$, $C_k^i$ will output a bit $b$. Let $p_{k,i}^C$ denote the probability
that $b$ = the $i+1$st bit of $s$. We say that $S$ passes the test $C$ if for every polynomial $Q$,
all sufficiently large $k$ and all $i < 2^{k^a}$:

$$p_{k,i}^C < \frac{1}{2} + \frac{1}{Q(k)}.$$

We will refer to both $C$ and $C_k^i$ as $2^{k^a}$-next-bit tests.

Definition Let $a < 1$ be a fixed constant, $I_k$ the set of all strings of length $k$,
and $D_k \subseteq I_k$ a set of inputs (or “seeds”) of length $k$. Let $G$ be a deterministic
algorithm which, on input a seed $x \in D_k$, outputs a $2^{k^a}$-long bit sequence $s_x$ in
Poly($2^{k^a}$) time. Let $S_k = \{s_x \mid x \in D_k\}$. The algorithm $G$ is a Cryptographically
$2^{k^a}$-strong pseudo-random bit generator (or simply a $2^{k^a}$-generator) if the multiset
$S = \bigcup_k S_k$ passes all $2^{k^a}$-next-bit tests.


Definition The class P denotes those languages which can be accepted
deterministically in time polynomial in the length of the input.


We now state the main theorem of this section:

Theorem 5 If any $2^{k^a}$-one-way function exists, then $R \subseteq P$.


Proof The proof will proceed as follows. First we show that if any $2^{k^a}$-one-way
function exists, then it is possible to build a $2^{k^a}$-generator. Then we show that
the sequences output by this generator can be used to simulate any probabilistic
machine in deterministic polynomial time. First we give a set of conditions which
is sufficient to build $2^{k^a}$-generators.

Lemma 5.1 Let $a < 1$ be a fixed constant, let $I_k$ be the set of $k$-bit strings and let
$D_k \subseteq I_k$. Let $g_k : D_k \to D_k$ be a sequence of permutations and let $B_k : D_k \to \{0,1\}$
be a sequence of predicates. We will write $D = \bigcup_k D_k$, $g = \{g_k\}$ and $B = \{B_k\}$. If
the following set of conditions hold, then it is possible to construct a $2^{k^a}$-generator.

(1) The domain is accessible: there exists a probabilistic polynomial-time algorithm
which, on input $k$, chooses $x \in D_k$ with uniform probability.

(2) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$g_k(x)$.

(3) There exists a polynomial-time algorithm which, on inputs $k$ and $x \in D_k$, computes
$B_k(g_k(x))$.

(4) Let $C = \{C_k\}$ be any family of circuits such that each $C_k$ has $k$ inputs and size
Poly($2^{k^a}$). Let $Q$ be any polynomial. Then, for all sufficiently large $k$:

$$C_k(x) \ne B_k(x) \text{ for at least a fraction } \frac{1}{2} - \frac{1}{Q(k)} \text{ of the } x \in D_k.$$

Proof of Lemma 5.1 First we construct the $2^{k^a}$-CSB generator and then prove that
its outputs must pass all $2^{k^a}$-next-bit tests. The proof is very similar to that of Lemma
4.1.

Choose an appropriate value of $k$ to be the seed length and choose in probabilistic
polynomial time a random $x \in D_k$ to be used as the seed. Set $c = 2^{k^a}$, the desired
length of the output sequence, and generate the bits:

$$B_k(g_k(x)),\ B_k(g_k^2(x)),\ \ldots,\ B_k(g_k^c(x)).$$

The notation $g_k^j$ indicates the $j$-fold composition of $g_k$. Now, output these bits in
reverse order, i.e.:

$$B_k(g_k^c(x)),\ B_k(g_k^{c-1}(x)),\ \ldots,\ B_k(g_k(x)).$$

It should be clear that all of this can be accomplished in time Poly($2^{k^a}$) by conditions
(2) and (3).

It remains to show that the sequences output by this generator pass all $2^{k^a}$-next-bit
tests. Suppose that this is not true. Then there exists a polynomial $Q$ and a family of
Poly($2^{k^a}$) size circuits $C = \{C_k^i\}$ where each $C_k^i$ has $i < 2^{k^a}$ inputs and the following
holds. For each of infinitely many values of $k$ there exists some $i$ such that:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)},$$

where $p_{k,i}^C$ is the probability that the circuit $C_k^i$ outputs the correct $i+1$st bit of a
sequence when given the first $i$ bits as input.

We now construct a family of circuits $A = \{A_k\}$ where each $A_k$ has $k$ inputs, size
Poly($2^{k^a}$) and such that for infinitely many values of $k$, $A_k$ correctly computes the
predicate $B_k(x)$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$. This contradicts condition (4).

Choose one of the infinitely many values of $k$ such that for some $i < 2^{k^a}$:

$$p_{k,i}^C \ge \frac{1}{2} + \frac{1}{Q(k)}.$$

The circuit $A_k$ uses $C_k^i$ as a “subroutine”. On input $x \in D_k$, the circuit $A_k$ first
generates the $i$-bit sequence:

$$B_k(g_k^i(x)),\ B_k(g_k^{i-1}(x)),\ \ldots,\ B_k(g_k(x))$$

and inputs this sequence to the next-bit test circuit $C_k^i$. This can be accomplished with
$O(i\,\mathrm{Poly}(k))$ circuit gates since conditions (2) and (3) imply that $B_k(g_k(x))$ can be
computed in polynomial size for all $x \in D_k$. Since $i < 2^{k^a}$, this whole process takes
size:

$$O(2^{k^a}\,\mathrm{Poly}(k)) = \mathrm{Poly}(2^{k^a}).$$

The circuit $A_k$ then outputs whatever value $C_k^i$ outputs on these bits, making the total
size of $A_k$ equal to:

$$\mathrm{Poly}(2^{k^a} + 2^{k^a}) = \mathrm{Poly}(2^{k^a}).$$

The proof that the circuit $A_k$ computes the predicate $B_k$ with probability at least $\frac{1}{2} + \frac{1}{Q(k)}$
is given in Lemma 4.1 and so we omit it here. This completes the proof of the lemma.


We now show that, given any $2^{k^a}$-one-way function over an accessible domain
$E = \bigcup_k E_k$, it is possible to construct a new domain $D = \bigcup_k D_k$, a function $g = \{g_k\}$ and
a predicate $B = \{B_k\}$ which satisfy the four conditions of Lemma 5.1. Since the proof
is virtually identical to that of Lemma 4.2, we give very few details.


Lemma 5.2 Let $f = \{f_k\}$ be a $2^{k^a}$-weak one-way function over an accessible domain
$E = \bigcup_k E_k$. Then, by definition, there exists some constant $d$ such that given any
family of circuits $C = \{C_k\}$ where $C_k$ has $k$ inputs and size Poly($2^{k^a}$), we have, for
all sufficiently large $k$:

$$C_k(x) \ne f_k^{-1}(x) \text{ for at least a fraction } \frac{1}{k^d} \text{ of the } x \in E_k.$$

The following construction satisfies conditions (1)-(4) of Lemma 5.1:

(a) Set $D_k$ to be the cartesian product of $k^{2d}$ copies of $E_k$. It will be less cumbersome
to write $ck$ instead of $k^{2d}$ so we will generally choose to do this, as in Chapter 2.
Thus, $D_k$ is the cartesian product of $ck$ copies of $E_k$ where $c = k^{2d-1}$. Formally:

$$D_k = \{ (x_1, x_2, \ldots, x_{ck}) \mid x_1 \in E_k, \ldots, x_{ck} \in E_k \}.$$

(b) Let $g_k((x_1, x_2, \ldots, x_{ck})) = (f_k(x_1), f_k(x_2), \ldots, f_k(x_{ck}))$ where each $x_j \in E_k$.

(c1) Let $B_k^i(x)$ = the $i$th bit of $f_k^{-1}(x)$ where $x \in E_k$.

(c2) Let

$$B_k((x_1, x_2, \ldots, x_{ck})) = \bigoplus_{i=1}^{k} \bigoplus_{j=1}^{c} B_k^i(x_{c(i-1)+j})$$

where each $x_j \in E_k$ and “$\oplus$” denotes the “exclusive-or” operation.


Proof of Lemma 5.2 It should be fairly obvious that conditions (1), (2) and (3) are
satisfied by this construction (for more details, see the proof of Lemma 4.2). It remains
to show that for any family of circuits $C = \{C_k\}$ where each $C_k$ takes inputs from the
domain $D_k$ and has size Poly($2^{k^a}$), we get for all sufficiently large $k$:

$$\mathrm{Prob}[\,C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))\,] < \frac{1}{2} + \frac{1}{Q(k)},$$

where $(x_1, x_2, \ldots, x_{ck})$ is a randomly chosen element of $D_k$. We do this through a
sequence of two further lemmas.

Lemma 5.2.1 Let $C = \{C_k\}$ be a family of circuits where each circuit $C_k$ takes inputs
from the domain $D_k$ and has size Poly($2^{k^a}$). Suppose that there exists a polynomial
$V(k)$ and a constant $d$ such that for all sufficiently large $k$ we have, for a randomly
chosen $x \in E_k$:

$$\mathrm{Prob}\left[\,s_i^C(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of $i$ from 1 to $k$}\,\right] \ge 1 - \frac{1}{k^d}.$$

Then, there exists a family of circuits $A = \{A_k\}$ where each $A_k$ has $k$ Boolean inputs,
size Poly($2^{k^a}$), and for all sufficiently large $k$:

$$A_k(x) = f_k^{-1}(x) \text{ for at least a fraction } 1 - \frac{1}{k^d} \text{ of the } x \in E_k.$$


Proof The main details of the proof are identical to the proof of Lemma 1.2.2
in Chapter 2. The only remaining step necessary here is to show that, under the
assumptions of this lemma, Algorithm A2 on page ? can be converted into a circuit $A_k$
of size Poly($2^{k^a}$). As we saw in Lemma 4.2, where Algorithm A2 is reproduced, this
algorithm can be converted into a circuit $A_k$ of size:

$$O(\mathrm{Poly}(k)\,\mathrm{size}(C_k))$$

where size($C_k$) is the size of the circuit $C_k$. Since, by assumption, we have:

$$\mathrm{size}(C_k) = \mathrm{Poly}(2^{k^a})$$

we know that $A_k$ has total size:

$$O(\mathrm{Poly}(k)\,\mathrm{Poly}(2^{k^a})) = \mathrm{Poly}(2^{k^a}).$$

This finishes the proof of the lemma. Since $f_k$ is assumed to be a $2^{k^a}$-one-way function
we have actually shown:


Corollary 5.2.1 Let $C = \{C_k\}$ be any family of circuits where each $C_k$ takes inputs
from the domain $D_k$ and has size Poly($2^{k^a}$). Let $V(k)$ be any polynomial. Then for all
sufficiently large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[\,s_i^C(x) \ge \frac{1}{2} + \frac{1}{V(k)} \text{ for every value of $i$ from 1 to $k$}\,\right] < 1 - \frac{1}{k^d}.$$


We need one more lemma to complete the proof of Lemma 5.2. 


Lemma 5.2.2 Let $C = \{C_k\}$ be any family of circuits where each $C_k$ takes inputs
from the domain $D_k$ and has size Poly($2^{k^a}$). Let $Q(k)$ be any polynomial. Then, for a
randomly chosen $(x_1, x_2, \ldots, x_{ck}) \in D_k$ we have, for all sufficiently large $k$:

$$\mathrm{Prob}[\,C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))\,] < \frac{1}{2} + \frac{1}{Q(k)}.$$


Proof Let $V(k)$ be any polynomial. Then, from Corollary 5.2.1, for all sufficiently
large $k$ we have, for a randomly chosen $x \in E_k$:

$$\mathrm{Prob}\left[\,s_i^C(x) < \frac{1}{2} + \frac{1}{V(k)} \text{ for some value of $i$ from 1 to $k$}\,\right] > \frac{1}{k^d}.$$

Then, from Lemma 1.2.3, there exists a constant $a$ and a polynomial $Q$ with $V(k) = aQ(k)$
such that, for a random $(x_1, x_2, \ldots, x_{ck}) \in D_k$:

$$\mathrm{Prob}[\,C_k((x_1, x_2, \ldots, x_{ck})) = B_k((x_1, x_2, \ldots, x_{ck}))\,] < \frac{1}{2} + \frac{1}{Q(k)}.$$

This completes the proof of both Lemma 5.2.2 and Lemma 5.2.


We now continue the proof of Theorem 5 by showing that the $2^{k^a}$-generator which we
have just constructed also passes all statistical tests of a type similar to those defined
in Chapter 2.


Definition Let $a < 1$ be a fixed constant, $S_k$ be a multiset consisting of $2^{k^a}$-long
bit sequences and $S = \bigcup_k S_k$. A $2^{k^a}$-statistical test for strings is a family of circuits
$C = \{C_k\}$. Each circuit $C_k$ has $2^{k^a}$ Boolean inputs, one Boolean output and size
Poly($2^{k^a}$). The multiset $S$ passes the test $C$ if for every polynomial $Q$, and all sufficiently
large $k$:

$$|p_k^S - p_k^R| < \frac{1}{Q(k)},$$

where $p_k^S$ denotes the probability that $C_k$ outputs 1 on a randomly selected $s \in S_k$
and $p_k^R$ denotes the probability that $C_k$ outputs 1 on a randomly selected $2^{k^a}$-long
bit sequence. We will refer to both $C$ and $C_k$ as statistical tests.


We can now show that $2^{k^a}$-generators pass all $2^{k^a}$-statistical tests.


Lemma 5.3 Let $a < 1$ be a fixed constant, $S_k$ a multiset consisting of $2^{k^a}$-long bit
sequences and $S = \bigcup_k S_k$. If the set $S$ passes all $2^{k^a}$-next-bit tests then $S$ passes all
$2^{k^a}$-statistical tests.


Proof The proof is very similar to the proof of Theorem 2 so many details will be
omitted. So suppose that a multiset $S = \bigcup_k S_k$ fails some $2^{k^a}$-statistical test. Then,
by definition, there is a family of circuits $C = \{C_k\}$ such that $C_k$ has $2^{k^a}$ inputs and
size Poly($2^{k^a}$). Furthermore, there exists a polynomial $Q$ such that for infinitely many
values of $k$:

$$|p_k^S - p_k^R| \ge \frac{1}{Q(k)},$$

where $p_k^S$ is the probability that $C_k$ outputs 1 on a randomly selected sequence in
$S_k$ and $p_k^R$ is the probability that $C_k$ outputs 1 on a randomly selected $2^{k^a}$-long bit
sequence. We now show that $S$ must necessarily fail a $2^{k^a}$-next-bit test.

Let $[x|_i]y$ denote the concatenation of the first $i$ bits of $x$ with $y$. For each $i \le 2^{k^a}$,
let $p_k^i$ denote the probability that $C_k([x|_i]y) = 1$ where $x$ is chosen randomly from $S_k$
and $y$ is a random string of $2^{k^a} - i$ bits. Thus, $p_k^i$ is the probability that the statistical
test $C_k$ outputs 1 when given as input the first $i$ bits of some string in $S_k$ followed by
$2^{k^a} - i$ random bits. Note that $p_k^0 = p_k^R$ and $p_k^{2^{k^a}} = p_k^S$.


We now construct a family of circuits $A = \{A_k^i\}$ which constitute a next-bit test
for $S$. Each $A_k^i$ has $i < 2^{k^a}$ inputs and size Poly($2^{k^a}$). Let $k$ be chosen such that:

$$|p_k^S - p_k^R| \ge \frac{1}{Q(k)}.$$

Assume, without loss of generality, that

$$p_k^S - p_k^R \ge \frac{1}{Q(k)}.$$

Then, clearly, there must be a value of $i$ such that

$$p_k^{i+1} - p_k^i \ge \left(\frac{1}{Q(k)}\right)\left(\frac{1}{2^{k^a}}\right).$$

(Remember, $i$ varies between 0 and $2^{k^a}$.) The circuit $A_k^i$ will correctly predict this
$i+1$st bit of any sequence in $S_k$ with high probability.
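
The step "there must be a value of $i$" is an averaging argument: the $n$ consecutive differences $p^{i+1} - p^i$ sum to $p^n - p^0$, so at least one of them is no smaller than the average. A minimal Python sketch of that search follows. The function name `find_large_gap` and the sample values are hypothetical, and exact rationals are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from typing import List, Optional


def find_large_gap(p: List[Fraction], delta: Fraction) -> Optional[int]:
    """Given hybrid probabilities p[0..n] with p[n] - p[0] >= delta, return the
    first index i with p[i+1] - p[i] >= delta / n (None if no index qualifies)."""
    n = len(p) - 1
    for i in range(n):
        if p[i + 1] - p[i] >= delta / n:
            return i
    return None


if __name__ == "__main__":
    hybrids = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
    print(find_large_gap(hybrids, Fraction(1, 2)))
```

In the proof, $n = 2^{k^a}$ and `delta` plays the role of $1/Q(k)$, which is where the factor $1/2^{k^a}$ in the bound comes from.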


Recall that $p_k^{i+1}$ is the probability that $C_k$ outputs 1 on any string consisting of
an $i+1$-bit initial segment of a string in $S_k$ followed by $2^{k^a} - (i+1)$ random bits. Let
$q_k^{i+1}$ be the probability that $C_k$ outputs 1 when given the first $i$ bits of a string in $S_k$
followed by the incorrect $i+1$st bit, followed by $2^{k^a} - (i+1)$ random bits. Note that:

$$p_k^i = \frac{p_k^{i+1} + q_k^{i+1}}{2}$$

thus:

$$\begin{aligned}
p_k^{i+1} - p_k^i &= p_k^{i+1} - \frac{p_k^{i+1} + q_k^{i+1}}{2} \\
&= \frac{p_k^{i+1}}{2} - \frac{q_k^{i+1}}{2} \ge \left(\frac{1}{Q(k)}\right)\left(\frac{1}{2^{k^a}}\right) = \frac{1}{Q(k)\,2^{k^a}} \qquad (1)
\end{aligned}$$

so:

$$p_k^{i+1} > q_k^{i+1}.$$


Consider in particular line (1). The fact that the difference between $p_k^{i+1}$ and $q_k^{i+1}$ is
large means that if $C_k$ is given as input the first $i$ bits of some string $s \in S_k$ followed
by $2^{k^a} - i$ random bits, it can effectively distinguish between those strings containing
the correct $i+1$st bit of $s$ in their $i+1$st position, and those containing the incorrect
$i+1$st bit.


Since $p_k^{i+1} + q_k^{i+1} = 1$ and $p_k^{i+1} > q_k^{i+1}$, if we can be sure of correctly predicting
the $i+1$st bit of a randomly drawn string in $S_k$ with probability $p_k^{i+1}$, then we will be
correct more than half the time. The next-bit test $A_k^i$ is constructed to do this.


So, suppose $A_k^i$ is given as input $b_1, b_2, \ldots, b_i$, the first $i$ bits of a randomly chosen sequence $s \in S_k$. Let $b$ be the correct $i+1$st bit of $s$ and let $\bar{b}$ be the incorrect $i+1$st bit. The circuit $A_k^i$ first calculates the probabilities:

$$r_b = \mathrm{Prob}[C_k(b_1, \ldots, b_i, b, b_{i+2}, \ldots, b_{2^{k^\varepsilon}}) = 1]$$

$$r_{\bar{b}} = \mathrm{Prob}[C_k(b_1, \ldots, b_i, \bar{b}, b_{i+2}, \ldots, b_{2^{k^\varepsilon}}) = 1]$$

where $b_{i+2}, \ldots, b_{2^{k^\varepsilon}}$ are chosen randomly. In fact, these probabilities can only be estimated with high accuracy but, for the sake of clarity, let us assume for a moment that they can be calculated exactly. A discussion on estimating the probabilities appears later in the proof. Note that $A_k^i$ does not actually know which is the correct and which the incorrect $i+1$st bit. It simply calculates the two probabilities: one of them will be $r_b$, the other $r_{\bar{b}}$. The predicting circuit $A_k^i$ now chooses $b_{i+1} = b$ with probability $\frac{r_b}{r_b + r_{\bar{b}}}$ and chooses $b_{i+1} = \bar{b}$ with probability $\frac{r_{\bar{b}}}{r_b + r_{\bar{b}}}$.


Suppose we calculate the value of $r_b$ for each $i$-bit prefix of a string in $S_k$ and take the average of all these probabilities. This average is just the total probability with which the circuit $A_k^i$ correctly predicts the $i+1$st bit of a randomly drawn string in $S_k$. This average is also equal to $p_k^{i+1}$, so $A_k^i$ predicts correctly with probability $p_k^{i+1} > \frac{1}{2}$. Thus there must exist some polynomial $R$ such that $A_k^i$ correctly chooses the $i+1$st bit for sequences in $S_k$ with probability at least $\frac{1}{2} + \frac{1}{R(k)}$. The set $S$ therefore fails the next-bit test $A = \{A_k^i\}$ and this part of the proof is finished.

Estimating the probabilities $r_b$ and $r_{\bar{b}}$ is done using the same sampling procedure described in the proof of Theorem 2. However, here we have from line (1) above that:

$$p_k^{i+1} - q_k^{i+1} > \frac{2}{Q(k)\,2^{k^\varepsilon}}.$$

Thus, we would like our estimates to be off by less than $1/(2^{k^\varepsilon}Q(k))$ and, for reasons which will become clear below, we want the probability that an estimate is off by this amount or more to be at most:

$$\frac{1}{2^{2^{k^\varepsilon}}}.$$

To get the number of trials $t$ we therefore solve the following inequality, where $c = 1/(2^{k^\varepsilon}Q(k))$ and $\epsilon = 1/2^{2^{k^\varepsilon}}$:

$$t > \frac{\ln 2 - \ln \epsilon}{2c^2} = \frac{1}{2}\left[\ln 2 - \ln\left(2^{-2^{k^\varepsilon}}\right)\right]\left(2^{2k^\varepsilon}(Q(k))^2\right).$$

This is certainly satisfied by:

$$t \ge 2^{2k^\varepsilon}(Q(k))^2 + 2^{3k^\varepsilon}(Q(k))^2.$$
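
That the simpler expression dominates the exact bound can be checked numerically. The sketch below is only an illustration: `m` is a hypothetical stand-in for $2^{k^\varepsilon}$ and `q` for $Q(k)$, so that $c = 1/(mq)$ and the failure probability is $2^{-m}$. The logarithm of the failure probability is computed directly as $-m \ln 2$ to avoid floating-point underflow.

```python
import math


def exact_trial_bound(m, q):
    """(ln 2 - ln eps) / (2 c^2) with c = 1/(m q) and eps = 2^(-m)."""
    ln_eps = -m * math.log(2)  # ln(2^(-m)), computed without forming 2^(-m)
    c = 1.0 / (m * q)
    return (math.log(2) - ln_eps) / (2 * c * c)


def simple_trial_bound(m, q):
    """The closed form m^2 q^2 + m^3 q^2 used for the number of trials."""
    return m ** 2 * q ** 2 + m ** 3 * q ** 2
```

Expanding the exact bound gives $\frac{\ln 2}{2}(1 + m)\,m^2 q^2$, and since $\frac{\ln 2}{2} < 1$ this never exceeds $m^2 q^2 + m^3 q^2$.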


We will now show that if the number of trials $t$ satisfies this inequality then the next-bit circuit can actually be built in such a way that, given any $i$-bit prefix of a string in $S_k$ as input, the circuit can, with certainty, estimate $r_b$ and $r_{\bar{b}}$ with an error of less than $1/(2^{k^\varepsilon}Q(k))$. The argument is probabilistic. Suppose that $t = 2^{2k^\varepsilon}(Q(k))^2 + 2^{3k^\varepsilon}(Q(k))^2$ and let $W = \{w_1, \ldots, w_t\}$ be a set of random strings each of length $2^{k^\varepsilon} - (i+1)$. In other words each element $w_j$ of $W$ is a string of random bits:

$$w_j = b_{i+2}, b_{i+3}, \ldots, b_{2^{k^\varepsilon}}.$$


These strings will be used to do the sampling. Let $x$ be a randomly chosen $i$-bit prefix of a string in $S_k$. We will say that the set $W$ fails on $x$ if, given $x$ as input, at least one of the probabilities $r_b$ and $r_{\bar{b}}$ cannot be estimated with an error of less than $1/(2^{k^\varepsilon}Q(k))$ when using all of the strings in $W$ to do the sampling. Since the chance of failing on either $r_b$ or $r_{\bar{b}}$ is less than $1/2^{2^{k^\varepsilon}}$ (because $\epsilon = 1/2^{2^{k^\varepsilon}}$), we get:

$$\mathrm{Prob}[W \text{ fails on a random } i\text{-bit prefix of a string in } S_k] < \frac{2}{2^{2^{k^\varepsilon}}}.$$

Since there are at most $2^i$ $i$-bit prefixes:

$$\mathrm{Prob}[W \text{ fails on at least one } i\text{-bit prefix of a string in } S_k] < \frac{2}{2^{2^{k^\varepsilon}}}\,(2^i).$$

Since the next-bit circuit is trying to predict the $i+1$st bit and strings in $S_k$ have length $2^{k^\varepsilon}$, we know that $i$ can be at most $2^{k^\varepsilon} - 1$. Therefore,

$$\mathrm{Prob}[W \text{ fails on at least one } i\text{-bit prefix of a string in } S_k] < 1.$$

So, there must exist at least one set $W$ of $t$ strings which does not fail on any inputs to the next-bit circuit. This set $W$ will be "built into" the next-bit circuit, assuring that this circuit can always estimate the probabilities $r_b$ and $r_{\bar{b}}$ with sufficient accuracy to be sure of predicting correctly more than half the time. Since $t$ is polynomial in $2^{k^\varepsilon}$, the next-bit circuit will have size polynomial in $2^{k^\varepsilon}$ as required. A complete algorithm for computing the next bit is given on the next page.
Algorithm A3


input: an $i$-bit prefix of a string $x \in S_k$, given as $b_1, \ldots, b_i$
output: $y$ = $i+1$st bit of $x$ (with probability $\frac{1}{2} + \frac{1}{R(k)}$ for some polynomial $R(k)$)


(1) Set $count_0 = 0$

(2) Set $count_1 = 0$

(3) Set $t = 2^{2k^\varepsilon}(Q(k))^2 + 2^{3k^\varepsilon}(Q(k))^2$

(4) For sample = 1 to $t$ do

    (4.1) Choose next "built-in" input for sampling:
          $b_{i+2}, b_{i+3}, \ldots, b_{2^{k^\varepsilon}}$

    (4.2) Set $count_0 = count_0 + C_k(b_1, \ldots, b_i, 0, b_{i+2}, \ldots, b_{2^{k^\varepsilon}})$

    (4.3) Set $count_1 = count_1 + C_k(b_1, \ldots, b_i, 1, b_{i+2}, \ldots, b_{2^{k^\varepsilon}})$

(5) Set $q_0 = count_0 / t$

(6) Set $q_1 = count_1 / t$

(7) Output $y = 0$ with probability $\frac{q_0}{q_0 + q_1}$
    Output $y = 1$ with probability $\frac{q_1}{q_0 + q_1}$

It should be clear that this algorithm can be built into a next-bit circuit of size:

$$O(t \cdot \mathrm{size}(C_k))$$

where $\mathrm{size}(C_k)$ is the size of the circuit $C_k$. By assumption we have:

$$\mathrm{size}(C_k) = \mathrm{Poly}(2^{k^\varepsilon})$$

and from step (3) we see that:

$$t = \mathrm{Poly}(2^{k^\varepsilon}).$$

Thus, we can build a next-bit circuit of size $\mathrm{Poly}(2^{k^\varepsilon})$ which succeeds with probability at least $\frac{1}{2} + \frac{1}{R(k)}$ for some polynomial $R(k)$. This completes the proof of Lemma 5.3.
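
The sampling predictor built in this proof can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the circuit itself: `distinguisher` is a hypothetical stand-in for $C_k$, `samples` for the built-in set $W$, and `toy_test` is an invented test used only to exercise the code. The fallback to a fair coin when both counts are zero is an added assumption; the source does not address that case.

```python
import random


def next_bit_predictor(prefix, distinguisher, samples, rng):
    """Guess the bit following `prefix` by sampling the distinguisher.

    For each fixed tail in `samples`, run the distinguisher once with the
    candidate next bit set to 0 and once with it set to 1, then output a bit
    with probability proportional to how often its candidate was accepted.
    """
    count = [0, 0]
    for tail in samples:
        for bit in (0, 1):
            count[bit] += distinguisher(tuple(prefix) + (bit,) + tuple(tail))
    t = len(samples)
    q0, q1 = count[0] / t, count[1] / t
    if q0 + q1 == 0:
        # Assumption (not in the source): fall back to a fair coin.
        return rng.randrange(2)
    return 1 if rng.random() < q1 / (q0 + q1) else 0


def toy_test(s):
    """Hypothetical distinguisher: accept when the third bit equals the first."""
    return int(s[2] == s[0])
```

With 4-bit strings and a 2-bit prefix, each sample tail has length 1, so `samples = [(0,), (1,)]` covers every tail. On this toy test the predictor then always returns the first bit of the prefix, because only that candidate is ever accepted.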


Since, by definition, any $2^{k^\varepsilon}$-generator passes all $2^{k^\varepsilon}$-next-bit tests, we have actually shown:


Corollary 5.3 Let $G$ be a $2^{k^\varepsilon}$-generator whose output consists of the set $S = \bigcup_k S_k$ as defined above. Then $S$ passes all $2^{k^\varepsilon}$-statistical tests.


We are now finally ready to show how $2^{k^\varepsilon}$-generators can be used to simulate probabilistic computation deterministically in polynomial time.


Lemma 5.4 Suppose that a $2^{k^\varepsilon}$-generator exists. Then, $R \subseteq P$.
Proof Let $L \in R$ be accepted by a probabilistic machine $M_R$. Suppose that $M_R$ runs in time $n^j$ where $n$ is the input length. We construct a deterministic machine $M$ which accepts $L$ in polynomial time. On input $x$, machine $M$ simulates $M_R(x, y)$ using every possible output from the generator of length $n^j$ as a possible witness $y$. The machine $M$ accepts $x$ if and only if some output $w$ from the generator causes $M_R(x, w)$ to accept. By assumption, the generator "stretches" $k$-bit seeds into $2^{k^\varepsilon}$-bit pseudo-random sequences. Thus, in order to get all pseudo-random sequences of length $n^j$ output by the generator, the deterministic machine $M$ tries all inputs to the generator of length

$$(j \log n)^{1/\varepsilon}.$$

Also, the generator can output each sequence of length $n^j$ in time $\mathrm{Poly}(n^j)$ and the time required for each deterministic simulation of the probabilistic machine $M_R$ is $\mathrm{Poly}(n^j)$. The total time required for the entire simulation is therefore:


Poly(ni)(2!8")"/") — Poly(ni)(n(t/ila"*) 
= Poly(n), 


Thus, the machine $M_R$ can be simulated deterministically by $M$ in polynomial time.


Note that in our original definition of $2^{k^\varepsilon}$-generators (Lemma 5.2), we assumed only that each seed could be found in probabilistic polynomial time. Here, where we want to use the generator for a deterministic algorithm, we simply try every string of length $(j \log n)^{1/\varepsilon}$ as an input to the generator. Some of these might not be actual seeds for the generator, hence the resulting output might not be a pseudo-random sequence. However, this method will cover every possible seed, which is all that is necessary here.
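
The enumeration performed by $M$ can be sketched as follows. This is a minimal illustration, not the construction itself: `machine` and `generator` are hypothetical callables standing in for $M_R$ and the generator, and the toy versions below are invented solely to make the sketch runnable (the toy generator merely repeats its seed and is in no sense pseudo-random).

```python
from itertools import product


def deterministic_accepts(x, machine, generator, seed_len):
    """Accept x if some generator output makes the machine accept.

    Every bit string of length seed_len is tried as a seed, whether or not
    it is a legitimate seed, so all legitimate seeds are covered.
    """
    for seed in product((0, 1), repeat=seed_len):
        if machine(x, generator(seed)):
            return True
    return False


def toy_generator(seed):
    """Hypothetical stand-in generator: doubles the seed length."""
    return seed + seed


def toy_machine(x, y):
    """Hypothetical one-sided machine for the language of even integers:
    it never accepts an odd x, and accepts an even x when y starts with 1."""
    return x % 2 == 0 and y[0] == 1
```

The loop runs over $2^{\text{seed\_len}}$ candidate seeds, which is why the seed length determines the cost of the simulation. Because the toy machine has one-sided error, `deterministic_accepts` can only err by rejecting a member of the language, which is the failure mode analysed in the remainder of the proof.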


It remains to show that the machine $M$ accepts the same language $L$ which is accepted by $M_R$. As usual, let $I_k$ be the set of inputs to the generator of length $k$ and let $S_k = \{s_x \mid x \in I_k\}$. We will show that if $M$ does not accept $L$, then the set $S = \bigcup_k S_k$ output by the generator fails a $2^{k^\varepsilon}$-statistical test.


Recall that if $x \in L$ then for a randomly chosen string $y$, $M_R(x, y)$ accepts with probability at least $\frac{1}{2}$. If $x \notin L$, then for no string $y$ does $M_R(x, y)$ accept. So if $M$ does not accept the language $L$, there must be some $x \in L$ which $M$ incorrectly rejects (i.e., none of the possible witnesses $w$ output by the generator cause $M_R(x, w)$ to accept even though $x \in L$). Furthermore, there must exist infinitely many such strings $x \in L$ or else $M$ could be repaired without increasing its running time.


Let $x$ be any element of $L$ which $M$ fails to accept. Recall that the probabilistic machine $M_R$ runs in time $n^j$ where $n$ is the input length. So, on input $x$, the deterministic machine $M$ will simulate $M_R(x, y)$ on all strings $y$ of length $|x|^j$ output by the CSB generator. Thus $M$ will have to try all seeds of length $(j \log(|x|))^{1/\varepsilon}$ since the generator "stretches" $k$-bit inputs into $2^{k^\varepsilon}$-bit outputs. For the remainder of the proof, we will write $k$ in place of $(j \log(|x|))^{1/\varepsilon}$.

For each of the infinitely many such strings $x$, there will be a circuit $C_k$ with $|x|^j$ inputs which has $x$ "hard-wired" in. On any input $y$, the circuit $C_k$ simulates $M_R(x, y)$. This circuit outputs 1 if and only if $M_R(x, y)$ accepts. Since $M_R(x, y)$ runs in $\mathrm{Poly}(|x|)$ time, the circuit $C_k$ has size $\mathrm{Poly}(|x|)$. Since:

$$k = (j \log(|x|))^{1/\varepsilon}$$

the circuit has size $\mathrm{Poly}(2^{k^\varepsilon})$.

Since $x \in L$ we have, by the definition of $R$, for a randomly chosen string $y$ of length $|x|^j$:

$$p_k^R = \mathrm{Prob}\{C_k(y) = 1\} \ge \frac{1}{2}.$$

However, since $M_R(x, y)$ does not accept for any $y \in S_k$, for a randomly chosen $y \in S_k$ we have:

$$p_k^S = \mathrm{Prob}\{C_k(y) = 1\} = 0.$$

Thus $p_k^R - p_k^S \ge \frac{1}{2}$, so there is certainly a polynomial $R$ such that $p_k^R - p_k^S > \frac{1}{R(k)}$.


Therefore, the set $S$ fails the $2^{k^\varepsilon}$-statistical test $\{C_k\}$, contradicting our assumption that $S$ is the output of a $2^{k^\varepsilon}$-generator. This concludes the proof of both Lemma 5.4 and Theorem 5.
Appendix


In this appendix, we find a bound for $t$ which satisfies the following inequality:

$$2(1 - \Phi(z)) < \epsilon$$

where

$$z = 2c\sqrt{t}$$

and

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du.$$

First, we solve the following for $z$:

$$2\left(1 - \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du\right) < \epsilon.$$

Rearranging and using the fact that $\int_{-\infty}^{\infty} e^{-u^2/2}\, du = \sqrt{2\pi}$ gives:

$$\int_{z}^{\infty} e^{-u^2/2}\, du < \frac{\epsilon\sqrt{2\pi}}{2}.$$

There is, unfortunately, no closed form solution to the integral on the left side. We can, however, bound the left side from above by something which is easy to integrate. Assuming that $z$ is greater than or equal to 1, the value of $z$ obtained from solving the following inequality will certainly satisfy the original inequality. We will address the fact that $z$ must be greater than or equal to 1 below.

$$\int_{z}^{\infty} u\, e^{-u^2/2}\, du < \frac{\epsilon}{2}.$$

Integrating gives:

$$z^2 > 2(\ln 2 - \ln \epsilon).$$

Now, recall that $z = 2c\sqrt{t}$ which gives:

$$t > \frac{\ln 2 - \ln \epsilon}{2c^2}. \qquad (1)$$

In order to bound our original integral, we had to assume that $z$ be greater than or equal to 1. Since $z = 2c\sqrt{t}$ we get:

$$2c\sqrt{t} \ge 1$$

which means:

$$t \ge \frac{1}{4c^2}. \qquad (2)$$

Now, since $0 < \epsilon < 1$ we can satisfy both inequalities (1) and (2) by choosing $t$ so that:

$$t > \frac{\ln 2 - \ln \epsilon}{2c^2}.$$
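
The bound can be sanity-checked numerically. The sketch below is an illustration only, with hypothetical function names: it picks the smallest integer $t$ exceeding $(\ln 2 - \ln \epsilon)/(2c^2)$ and evaluates the two-sided normal tail using the identity $2(1 - \Phi(z)) = \mathrm{erfc}(z/\sqrt{2})$, where `math.erfc` is the complementary error function from the Python standard library.

```python
import math


def trials_for(c, eps):
    """Smallest integer t with t > (ln 2 - ln eps) / (2 c^2)."""
    return math.floor((math.log(2) - math.log(eps)) / (2 * c * c)) + 1


def two_sided_tail(z):
    """2 (1 - Phi(z)) for the standard normal distribution."""
    return math.erfc(z / math.sqrt(2))


# Example: accuracy c = 0.1 and failure probability eps = 1e-6.
t = trials_for(0.1, 1e-6)
z = 2 * 0.1 * math.sqrt(t)
tail = two_sided_tail(z)  # should fall below eps
```

For any $0 < \epsilon < 1$ the chosen $t$ also gives $z \ge 1$, since $\ln 2 - \ln \epsilon > \ln 2 > \frac{1}{2}$ makes bound (1) larger than bound (2).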
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